We investigate general probabilistic theories in which every mixed state has a purification, unique
up to reversible channels on the purifying system. We show that the purification principle is equiv- 
alent to the existence of a reversible realization of every physical process, that is, to the fact that 
every physical process can be regarded as arising from a reversible interaction of the system with 
an environment, which is eventually discarded. From the purification principle we also construct 
an isomorphism between transformations and bipartite states that possesses all structural proper- 
ties of the Choi-Jamiolkowski isomorphism in quantum theory. Such an isomorphism allows one 
to prove most of the basic features of quantum theory, like e.g. existence of pure bipartite states 
giving perfect correlations in independent experiments, no information without disturbance, no 
joint discrimination of all pure states, no cloning, teleportation, no programming, no bit com- 
mitment, complementarity between correctable channels and deletion channels, characterization of 
entanglement-breaking channels as measure-and-prepare channels, and others, without resorting to 
the mathematical framework of Hilbert spaces. 
I. INTRODUCTION

In the past two decades the field of quantum informa- 
tion theory has brought to light an enormous amount 
of protocols and tasks that originate from the structure 
of quantum theory and have dramatic consequences in 
the way information can be processed. Non-locality, no- 
cloning, teleportation, dense coding, quantum key distri- 
bution, quantum algorithms, and quantum error correc- 
tion are only the most celebrated examples of a much 
longer list. An important lesson from this experience is 
that the abstract formalism of quantum mechanics has a 
huge number of operational consequences. 

At the same time, the question whether quantum the- 
ory is the only conceivable theory with such operational 
consequences has attracted the attention of an increasing 
number of researchers. In a seminal paper Popescu 
and Rohrlich showed that non-locality is not an ex- 
clusive feature of quantum theory, and that there are 
in fact possible theories that exhibit stronger nonlocal- 
ity than quantum theory without violating relativistic 
no-signaling. An intense work on non-locality in gen- 
eral non-signaling theories has followed this observation, 
opening a very active line of research (see e.g. 
On the other hand, the authors of Refs. [||, Q have 
analyzed tasks like cloning and broadcasting of states, 
showing that the impossibility of achieving them is a 
highly generic property, while Ref. [8] thoroughly discussed
theories with a local discriminability property that
share other features of quantum mechanics, like the non- 
unique convex decomposition of a mixed state or the non- 
existence of ideal non-disturbing measurements. Entan- 
glement swapping and teleportation protocols have been 
considered in Refs |l^ , where the authors noticed the 
remarkable fact that the no-signaling boxes of Popescu 
and Rohrlich do not allow for entanglement swapping, 
nor for teleportation. Very recently, the authors of Ref.
pl| have introduced the new physical principle of infor- 
mation causality, showing that while the principle holds 
for quantum theory, it is violated by Popescu-Rohrlich 
boxes. 

Despite the numerous advancements in the under- 
standing of general probabilistic theories, the fundamen- 
tal problem of deriving quantum mechanics from basic 
physical principles is still completely open. In particular, 
no physical principle is known that can single out quan- 
tum mechanics in the physically motivated set of causal 
theories with local discriminability. With this expression 
we mean probabilistic theories where i) the probability 
of outcomes of an experiment performed at a given time 
does not depend on the choice of experiments that will be 
performed at later times, and ii) if two bipartite states 
are different, then one can discriminate between them 
using only local devices with an error probability that is 
smaller than 1/2, the random guess value. In the case of 
classical physics, finding a description is relatively simple: 
among theories in the above family, classical probability 
is the only one where all pure states are perfectly distin- 
guishable. On the contrary, every current description of 
quantum theory is a description of its mathematical ap- 
paratus: e.g. one can say that quantum theory is the the- 
ory where pure states are unit vectors in complex Hilbert 
spaces and probabilities are given by the Born rule, or, 
equivalently, that it is the theory where observables form
a C*-algebra of complex matrices. 

In the past there have been many attempts to find a 
more basic description of quantum theory, in particular 
by discussing it from the point of view of logic [12-15]
(see also Ref. jl^ and references therein). More recently,
Hardy Q has approached the problem from a differ- 
ent perspective, providing a characterization of quantum 
theory based on principles of mathematical simplicity in 
the interplay among dimension of the state space, struc- 
ture of subsystems and subspaces, number of distinguish- 
able states, and topology of the set of pure states. On 
the other hand, in recent years one of the authors has 
tackled the problem using physical principles related to 
tomography and calibration of physical devices, experi- 
mental complexity, and to the composition of elementary 
(atomic) transformations (see Ref. for the state of 
the art of this project). In particular, Ref. firstly in- 
troduced the concept of dynamically and preparationally 
faithful state, which will play an important role in this 
paper. 

In this paper we introduce the purification principle
"Every mixed state has a purification, unique up to
reversible channels on the purifying system". The main
message of our work is simple: most of the characteristic
features of quantum theory can be summarized in the
physical statement "quantum theory is a causal theory
with purification and local discriminability". In particular,
from the purification principle we derive the following
features: no information without disturbance, no joint 
discriminability and no cloning of pure states, existence 
of pure entangled states with perfect correlations, proba- 
bilistic teleportation, one-to-one correspondence between 
transformations and bipartite states, dilation of physical 
processes to reversible interactions with an environment, 
necessary and sufficient conditions for error correction in 
terms of the reversible dilation, no bit commitment, no 
programming of reversible channels without perfectly distinguishable
program states, identification of causal
channels with sequences of channels with memory, and
characterization of entanglement breaking channels as 
measure-and-prepare channels. Moreover, we also discuss 
a stronger version of the purification principle: "For every
system A there exists a conjugate system Ã such that
every state of A has a purification in AÃ. The conjugate
of Ã is A (symmetry), and the conjugate of a composite
system AB is the composite system ÃB̃ (regularity under
composition)". With this further property one can prove
deterministic teleportation and show that its structure is 
unique: the resource state for deterministic teleportation 
must be a purification of the unique mixed state that is 
invariant under all reversible channels. 

As we will show, the purification principle is equivalent 
to the fact that every irreversible process arises from a 
reversible interaction with an environment that is eventu- 
ally lost. This can be viewed as a law of "conservation of 
information" : information cannot be erased, it can only 
be discarded. Moreover, we will see that the purifica- 
tion principle has other remarkable consequences: From 
the structural point of view, a theory with purification is 
completely identified by the states of all possible systems 
in it. Once the states are given, all possible measure- 
ments and evolutions are fixed. Even more strongly, the 
purification postulate implies the completeness property 
"whatever transformation is mathematically admissible 
(in a sense that will be made precise later) must be fea- 
sible". Conversely, we can explicitly say that whatever 
limitation to the feasibility of a mathematically admissi- 
ble map results in a limitation to the purifiability of some 
state. The analogue of this property in quantum infor- 
mation is that every trace-preserving completely positive 
map must be feasible. 

It is important to stress that we are not claiming that 
we derived quantum theory. What we can say is that
we "zipped" a large part of it, by reducing a long list of
features to a single physical principle. In the process of
doing this, we found proofs that are often simpler (or at
least more intuitive) than the original quantum proofs. 

In order to minimize the notational burden due to the 
lack of a commonly established formalism, in presenting 
these proofs we opted for a graphical notation, which is 
equivalent to formulae and replaces them in most of the
paper. Since this notation is exactly the same notation 
used in quantum circuits, a reader with a background in 
quantum information can easily read the general equa- 
tions without spending too much time in the introduc- 
tory part of the paper. On the other hand, an extended 
discussion on graphical calculus can be found in the work 
by Penrose pO| and in the rigorous formalization by Joyal 
and Street within the theory of symmetric monoidal cat- 
egories (we also suggest the beautiful introductions 
to the topic by Selinger [g2| and Coecke Q ). We anyway
stress that in the present paper the choice of graphical 
notation is just the choice of a more user-friendly way 
of presenting formulae, and that no prerequisite on e.g. 
category theory is needed from the reader. 



II. OPERATIONAL-PROBABILISTIC THEORIES

In this Section we introduce some basic notions that
will be used in the paper. In particular, we introduce 
the notion of operational-probabilistic theory as a theory 
that i) describes a set of possible experiments that can be 
done with physical devices and ii) gives predictions about 
the probabilities of the outcomes in these experiments. 



A. Systems and tests 

Systems and tests are the primitive notions of an oper- 
ational theory. Each test represents one use of a physical 
device, like a Stern-Gerlach magnet, a beamsplitter, or
a photon counter. Systems play the role of labels attached
to physical devices: any device has an input and
an output port labeled by an input and an output system,
respectively. These labels establish a rule for connecting 
physical devices among themselves: two devices can be 
connected in a sequence only if the output of the first 
device is a system of the same type as the input of the 
second. 

All throughout the paper we will denote systems with 
capital letters, like A,B,C, and so on. We reserve the 
letter I for the trivial system, which simply means "nothing".
A device with input (output) system I is a device
with no input (no output). 

Let us now make more precise the notion of test. We 
already mentioned that a test represents one use of a 
physical device. When the physical device is used, it 
produces an outcome i in some set X, e.g. the outcome 
could be a sequence of digits appearing on a display, a 
light, or a sound emitted by the device. The outcome 
produced by the device heralds the fact that some event 
has occurred. These intuitive features concur in the def- 
inition of test: 

Definition 1 (Test) A test with input system A and
output system B is a collection of events $\{\mathcal{C}_i\}_{i\in X}$ labeled
by outcomes in some outcome set X. Diagrammatically,
the test $\{\mathcal{C}_i\}_{i\in X}$ is represented as follows

    A --[ {C_i}_{i in X} ]-- B        (1)

while the specific event $\mathcal{C}_i$ is represented by

    A --[ C_i ]-- B        (2)

We denote by $\mathfrak{T}(A,B)$ the set of all events appearing in
all tests from A to B. When B = A we will write $\mathfrak{T}(A)$.

Tests with trivial input will be called preparation-tests,
and the corresponding events will be called preparation-events.
In quantum information, a preparation-test is
what is called a "random source of quantum states". In
analogy we will adopt for preparation-events the usual
notation as for states in quantum circuits:

    ( rho_i ]-- B        (3)

In formulae, we will often use the "Dirac-like" notation
$|\rho_i)_B$ to denote a preparation-event of system B. We will
denote by $\mathfrak{S}(A)$ the set of preparation-events for system
A, namely $\mathfrak{S}(A) := \mathfrak{T}(I,A)$.

Similarly, we will call tests with trivial output
observation-tests, and the corresponding events
observation-events. In quantum theory, an observation-test
is a quantum measurement, and is represented
by a positive operator valued measure (POVM), that is,
by a collection of positive operators $\{P_i\}_{i\in X}$ satisfying
$\sum_{i\in X} P_i = I_A$, where $I_A$ is the identity on the Hilbert
space of system A. For observation-tests we will then
adopt the usual notation for measurements in quantum
circuits:

    A --[ a_j )        (4)

In formulae, we will often denote observation-events with
the notation $(a_j|_A$. We will denote by $\mathfrak{E}(A)$ the set of
observation-events for system A, namely $\mathfrak{E}(A) := \mathfrak{T}(A,I)$.

For tests from the trivial system to itself we will omit
the box and the wires, as follows:

    [ p_k ]  =  p_k        (5)

In Subsect. II F we will interpret events from the trivial
system to itself as probabilities.

Another important case of tests is that of single-outcome
tests, in which the outcome space X consists
of a single element: $X = \{i_0\}$. Whenever a device represented
by a single-outcome test is used, the experimenter
is sure that only one event can take place. This motivates
the following definition:

Definition 2 (Deterministic tests) A test is deterministic
if its outcome set has a single element, namely
$|X| = 1$.

B. Sequential composition of tests

Physical devices can be used in sequences, as long as 
the output of each device coincides with the input of the 
next one. When two tests are composed in a sequence 
we obtain a new test, as in the following 

Definition 3 (Sequential composition of tests) If
$\{\mathcal{C}_i\}_{i\in X}$ is a test from A to B and $\{\mathcal{D}_j\}_{j\in Y}$ is a test
from B to C, then their sequential composition is a test
from A to C, with outcomes $(i,j) \in X \times Y$, and events
$\{\mathcal{D}_j \circ \mathcal{C}_i\}_{(i,j)\in X\times Y}$. Diagrammatically, the events
$\mathcal{D}_j \circ \mathcal{C}_i$ are represented as follows

    A --[ C_i ]-- B --[ D_j ]-- C   =   A --[ D_j o C_i ]-- C        (6)

We will say that test $\{\mathcal{D}_j\}$ "follows" test $\{\mathcal{C}_i\}$, or,
equivalently, $\{\mathcal{C}_i\}$ "precedes" $\{\mathcal{D}_j\}$. For the moment, the
order of composition is not necessarily temporal. The
interpretation of sequential composition as a sequence of
time-steps will be given in Sect. III within the framework
of causal theories.
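
To make Definition 3 concrete, the following Python sketch composes two tests in a classical (stochastic) toy theory. Representing events as sub-stochastic matrices, as well as the helper names `mat_mul`, `sequential_composition`, and `sum_of_events`, are assumptions made here for illustration; they are not notation from the paper.

```python
from fractions import Fraction as F

def mat_mul(d, c):
    """Matrix product d @ c for matrices stored as lists of rows."""
    return [[sum(d[r][k] * c[k][s] for k in range(len(c)))
             for s in range(len(c[0]))] for r in range(len(d))]

def sequential_composition(test_c, test_d):
    """Events D_j o C_i, labeled by outcome pairs (i, j) in X x Y."""
    return {(i, j): mat_mul(d_j, c_i)
            for i, c_i in test_c.items()
            for j, d_j in test_d.items()}

def sum_of_events(test):
    """Entrywise sum of all events of a test."""
    events = list(test.values())
    rows, cols = len(events[0]), len(events[0][0])
    return [[sum(e[r][s] for e in events) for s in range(cols)]
            for r in range(rows)]

# Hypothetical test from a classical bit A to a bit B (columns index inputs).
test_c = {
    "c0": [[F(1, 2), F(0)], [F(0), F(1, 4)]],
    "c1": [[F(1, 2), F(0)], [F(0), F(3, 4)]],
}
# Hypothetical test from B to a bit C.
test_d = {
    "d0": [[F(1, 3), F(0)], [F(0), F(1)]],
    "d1": [[F(2, 3), F(0)], [F(0), F(0)]],
}

composite = sequential_composition(test_c, test_d)
total = sum_of_events(composite)
# Each column of the summed events adds up to one: some outcome always occurs.
column_sums = [sum(total[r][s] for r in range(2)) for s in range(2)]
```

The composite test has one event for each pair of outcomes, and summing over all pairs gives a stochastic matrix, as expected for a test of this toy theory.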

The sequential composition of tests brings immediately 
the notion of identity test. 

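Definition 4 (Identity test) The identity test for system
A is a test with a single event $\mathcal{I}_A$ such that for every
system B

    A --[ I_A ]-- A --[ C_i ]-- B   =   A --[ C_i ]-- B        $\forall\, \mathcal{C}_i \in \mathfrak{T}(A,B)$

    B --[ D_j ]-- A --[ I_A ]-- A   =   B --[ D_j ]-- A        $\forall\, \mathcal{D}_j \in \mathfrak{T}(B,A)$
                                                                  (7)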



Performing the identity test on a system just means "do- 
ing nothing" on it. We can think of the outcome of the 
identity test as a blank character, which provides no in- 
formation. 

In some protocols, such as teleportation, one wants to 
emphasize that one is dealing with two different systems
"of the same type". For example, in quantum theory one
can have two electrons in different (spatially separated) 
regions. Distinguishing two systems of the same type is 
essentially a matter of bookkeeping. Moreover, we can 
have different physical systems that are "operationally 
equivalent", e.g. the polarization of a single photon and
the spin of an electron in quantum theory are both rep- 
resented by a qubit, and can be (at least in principle) 
converted one to another in a reversible fashion. For 
this reason we introduce a formal notion of operational 
equivalence between systems, based on their mutual con- 
vertibility: 

Definition 5 (Operationally equivalent systems)
Two systems A and A' are operationally equivalent
(denoted as A' ~ A) if there exist a deterministic test
$\{\mathcal{I}_{A,A'}\}$ from A to A' and a deterministic test $\{\mathcal{I}_{A',A}\}$
from A' to A, respectively, such that

    A  --[ I_{A,A'} ]-- A' --[ I_{A',A} ]-- A    =   A  --[ I_A  ]-- A
    A' --[ I_{A',A} ]-- A  --[ I_{A,A'} ]-- A'   =   A' --[ I_{A'} ]-- A'        (8)

Accordingly, if $\{\mathcal{C}_i\}_{i\in X}$ is a test for system A, performing
the "same test" on system A' means performing the test
$\{\mathcal{C}'_i\}_{i\in X}$ defined by

    A' --[ C'_i ]-- A'   :=   A' --[ I_{A',A} ]-- A --[ C_i ]-- A --[ I_{A,A'} ]-- A'        (9)

Clearly, the above notion of "same test on a different
system" depends on the choice of the privileged test
$\{\mathcal{I}_{A,A'}\}$ used to set up the operational equivalence
between A and A'. We will often drop the primes and write
$\mathcal{C}_i$ instead of $\mathcal{C}'_i$.



C. Composite systems and parallel composition of tests

Given two systems A and B, one can consider them
together, thus forming the corresponding composite system,
here denoted by AB. A test with input (output)
system AB (CD) represents one use of a physical device
with two input (output) ports, labeled by A and B (C
and D), respectively.

Definition 6 (Composite system) If A, B are systems,
the corresponding composite system is AB. Composition
of systems enjoys the properties i) A = IA = AI,
ii) AB ~ BA, and iii) A(BC) = (AB)C =: ABC.

Diagrammatically, an event from AB to CD is represented
as a box with multiple wires:

    A --|       |-- C
        |  C_i  |          =    AB --[ C_i ]-- CD        (10)
    B --|       |-- D

The property i) in Def. 6 expresses the fact that system
A together with "nothing" is still system A, while properties
ii) and iii) express the fact that the specification
of a composite system depends only on the list of component
systems, and not on how the elements of the list
are ordered (up to operational equivalence, implemented
by a deterministic test that permutes the component systems),
nor on how they are grouped.

In general, we will represent the N-partite composite
system $A_1 \dots A_N$ with N wires, as follows:

    ---- A_1 ----
         ...            =    ---- A_1 A_2 ... A_N ----        (11)
    ---- A_N ----

In the case of trivial systems, we will typically omit the
wire. In the sequential composition of two boxes with
multiple wires we will always match the output wires of
the first box with the input wires of the second.

Physical devices can be run in parallel on different systems,
thus performing a test on the composite system, as
in the following

Definition 7 (Parallel composition of tests) If
$\{\mathcal{C}_i\}_{i\in X}$ is a test from A to B and $\{\mathcal{D}_j\}_{j\in Y}$ is a test
from C to D, then their parallel composition is the test
from AC to BD, with outcomes $(i,j) \in X \times Y$, and
events $\{\mathcal{C}_i \otimes \mathcal{D}_j\}_{(i,j)\in X\times Y}$. Diagrammatically the events
$\mathcal{C}_i \otimes \mathcal{D}_j$ are represented as follows

    A --[ C_i ]-- B
                         =    AC --[ C_i (x) D_j ]-- BD        (12)
    C --[ D_j ]-- D

If $\mathcal{C}_i, \mathcal{D}_j, \mathcal{E}_k, \mathcal{F}_l$ are events from A to B, B to C, D
to E, and E to F, respectively, their parallel composition
enjoys the property

    $(\mathcal{D}_j \otimes \mathcal{F}_l) \circ (\mathcal{C}_i \otimes \mathcal{E}_k) = (\mathcal{D}_j \circ \mathcal{C}_i) \otimes (\mathcal{F}_l \circ \mathcal{E}_k)$        (13)



Note that property (13) implies that tests on different
systems commute, that is, for every couple of events
$\mathcal{C}_i \in \mathfrak{T}(A,B)$ and $\mathcal{D}_j \in \mathfrak{T}(C,D)$ one has

    A --[ C_i ]-- B --[ I_B ]-- B        A --[ C_i ]-- B        A --[ I_A ]-- A --[ C_i ]-- B
                                    =                     =
    C --[ I_C ]-- C --[ D_j ]-- D        C --[ D_j ]-- D        C --[ D_j ]-- D --[ I_D ]-- D
                                                                                        (14)

From now on, in diagrams like the above we will typically
omit the box with identity test, leaving just a wire for
the corresponding system. Also in formulae we will often
omit the identity, e.g. for $\mathcal{C} \in \mathfrak{T}(A,B)$ and $\rho \in \mathfrak{S}(AC)$
we will often write $\mathcal{C}\,|\rho)_{AC}$ in place of $(\mathcal{C} \otimes \mathcal{I}_C)\,|\rho)_{AC}$.

Note that the difference between parallel and sequential
composition of two tests is already encoded in their
input and output spaces: if the input of a test is the output
of the other the composition is sequential, if all spaces
are distinct the composition is parallel. For this reason,
when the kind of composition is evident we will omit the
symbols $\circ$ and $\otimes$. For example, if $\rho$ is a preparation-event
for A and $\mathcal{C}$ is an event from A to B we will write $\mathcal{C}\,|\rho)_A$
in place of $\mathcal{C} \circ |\rho)_A$, whereas if $\rho$ and $\sigma$ are preparation-events
for A and B, respectively, we will write $|\rho)_A |\sigma)_B$
in place of $|\rho)_A \otimes |\sigma)_B$.
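
As a numerical sanity check of property (13) and of the commutation of tests on different systems, the following Python sketch represents events by small matrices and parallel composition by the Kronecker product. This matrix representation and the names `mat_mul` and `kron` are assumptions chosen for illustration; the integer matrices are arbitrary linear maps standing in for events, not events of any specific theory.

```python
def mat_mul(a, b):
    """Matrix product a @ b (sequential composition: b first, then a)."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def kron(a, b):
    """Kronecker product, standing in for the parallel composition a (x) b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

identity = [[1, 0], [0, 1]]
c_i = [[1, 2], [0, 1]]   # stands for an event from A to B
d_j = [[0, 1], [1, 1]]   # stands for an event from B to C
e_k = [[2, 0], [1, 1]]   # stands for an event from D to E
f_l = [[1, 1], [0, 2]]   # stands for an event from E to F

# Property (13): (D_j (x) F_l) o (C_i (x) E_k) = (D_j o C_i) (x) (F_l o E_k)
lhs = mat_mul(kron(d_j, f_l), kron(c_i, e_k))
rhs = kron(mat_mul(d_j, c_i), mat_mul(f_l, e_k))
interchange_holds = lhs == rhs

# Tests on different systems commute: the order of the two local steps is irrelevant.
first_e_then_c = mat_mul(kron(c_i, identity), kron(identity, e_k))
first_c_then_e = mat_mul(kron(identity, e_k), kron(c_i, identity))
commute = first_e_then_c == first_c_then_e == kron(c_i, e_k)
```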



D. Operational theories 

We are now in a position to make the notion of
"operational theory" more precise:

Definition 8 (Operational theory) An operational
theory is specified by a collection of systems, closed
under composition, and by a collection of tests, closed
under parallel and sequential composition.

In an operational theory one can draw circuits that i)
represent the connections of physical devices in an experiment,
e.g. the circuit

    ( {rho_i} ]-- A --[ {C_j} ]-- B --[ {a_k} )        (15)

and ii) can also represent which specific set of events took
place in the experiment, e.g. the circuit

    ( rho_i ]-- A --[ C_j ]-- B --[ a_k )        (16)

In particular, the latter circuit represents the
preparation-event $\rho_i$ followed by the event $\mathcal{C}_j$ from
system A to system B, which is in turn followed by the
observation-event $a_k$ on system B. The whole sequence
can be seen as a single event $p_{kji} := (a_k|\,\mathcal{C}_j\,|\rho_i)_A$ from
the trivial system to itself.



E. Relation with category theory 

In the previous Subsections we presented in an infor- 
mal way the basic notions pertaining to the use of phys- 
ical devices in sequences and in parallel. More formally, 
these notions can be summarized with the language of 
category theory, which provides the suitable mathematical
framework capturing the fundamental structure
presented so far. In this language, an operational theory 
is a category, where systems and events are respectively 
objects and arrows. Every arrow has an input and an 
output object, and arrows can be sequentially composed. 
A test is then a collection of arrows labeled by outcomes. 

The fact that in an operational theory we have a par- 
allel composition of systems, and that such a composi- 
tion is symmetric (i.e. AB ~ BA) is expressed in tech- 
nical words by saying that we have a strict symmet- 
ric monoidal category [^^. In the next Subsection we 
will specify more requirements on this category, impos- 
ing that the scalars (arrows from the trivial system to 
itself) are probabilities. 



F. Probabilistic structure: states, effects, and transformations



An operational theory is a language, whose words are 
diagrams representing circuits. With this language one 
can give instructions to build up experiments or, alter- 
natively, one can graphically represent which particular 
outcomes took place in an experiment. However, in a 
physical theory one wants more: one wants to give probabilistic
predictions about the occurrence of possible outcomes.
To have this, there must be a rule assigning a
probability to every event from the trivial system to itself.
More directly, we can say that in a probabilistic
theory the events from the trivial system to itself are
probabilities, as in the following

Definition 9 (Operational-probabilistic theory)
An operational theory is probabilistic if for every test
$\{p_i\}_{i\in X}$ from the trivial system I to itself one has
$p_i \in [0,1]$ and $\sum_{i\in X} p_i = 1$, and the composition of two
events from the trivial system to itself is given by the
product of probabilities: $p_i \otimes q_j = p_i \circ q_j = p_i q_j$.

For short, we will often refer to operational- 
probabilistic theories simply as probabilistic theories. 
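
The two requirements of Definition 9 can be spelled out in a few lines of Python. This is only an illustrative sketch: the function names `is_scalar_test` and `compose_scalar_tests` and the example numbers are assumptions introduced here.

```python
from fractions import Fraction as F
from itertools import product

def is_scalar_test(ps):
    """A test from I to I: each p_i lies in [0, 1] and the p_i sum to 1."""
    return all(0 <= p <= 1 for p in ps) and sum(ps) == 1

def compose_scalar_tests(ps, qs):
    """Composition of two scalar tests: outcome (i, j) has probability p_i * q_j."""
    return {(i, j): p * q
            for (i, p), (j, q) in product(enumerate(ps), enumerate(qs))}

ps = [F(1, 3), F(2, 3)]
qs = [F(1, 4), F(1, 4), F(1, 2)]
joint = compose_scalar_tests(ps, qs)
joint_is_test = is_scalar_test(list(joint.values()))
```

The product rule guarantees that the composite of two scalar tests is again a scalar test, i.e. that independent experiments have factorized joint probabilities.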

In a probabilistic theory, a preparation-event $\rho_i$ for
system A defines a function $\hat\rho_i$ sending observation-events
of A to probabilities:

    $\hat\rho_i : \mathfrak{E}(A) \to [0,1], \quad (a_j| \mapsto (a_j|\rho_i)$.        (17)

Likewise, an observation-event $a_j$ defines a function $\hat a_j$
from preparation-events to probabilities

    $\hat a_j : \mathfrak{S}(A) \to [0,1], \quad |\rho_i) \mapsto (a_j|\rho_i)$.        (18)



From a probabilistic point of view, two observation-events
(preparation-events) corresponding to the same
function are indistinguishable. This leads to the notions
of states and effects (see [|l^, ^): 

Definition 10 (States and effects) Equivalence 
classes of indistinguishable preparation-events are 
called states. Equivalence classes of indistinguishable 
observation-events are called effects. 

From now on we will identify preparation-events
with states and observation-events with effects, without
keeping the distinction between an event $\rho_i$ ($a_j$)
and the corresponding function $\hat\rho_i$ ($\hat a_j$). Accordingly,
a preparation(observation)-test will be a collection of
states (effects), and the sets $\mathfrak{S}(A)$, $\mathfrak{E}(A)$ will be the set of
states and the set of effects of system A, respectively.

Remark (states and effects in quantum theory).
In quantum theory systems are associated with Hilbert
spaces. The deterministic states of a system A are represented
by density matrices on the corresponding Hilbert
space: a deterministic state $\rho$ is a matrix satisfying $\rho \ge 0$
and $\mathrm{Tr}[\rho] = 1$. A non-deterministic preparation-test
$\{\rho_i\}_{i\in X}$, sometimes called a quantum information source,
is a collection of positive operators with the property
$\sum_{i\in X} \mathrm{Tr}[\rho_i] = 1$. Accordingly, the set $\mathfrak{S}(A)$ of all states
of system A is the collection of all unnormalized density
matrices $\rho$ with $\mathrm{Tr}[\rho] \le 1$. An effect is represented by a
positive operator P with $P \le I_A$ ($I_A$ being the identity
operator), and the probability resulting from the pairing
between a state $\rho$ and an effect P is given by the Born
rule: $(P|\rho)_A = \mathrm{Tr}[P\rho]$.
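
The quantum remark above can be checked on a qubit with a short Python sketch. The particular preparation-test (the state |0> with probability 1/3 and the state |+> with probability 2/3), the computational-basis measurement, and the helper names are assumptions chosen for illustration; only real matrices are used, to keep the example in exact arithmetic.

```python
from fractions import Fraction as F

def mat_mul(a, b):
    """Matrix product a @ b for matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def born(effect, state):
    """Born rule: (P|rho) = Tr[P rho]."""
    return trace(mat_mul(effect, state))

# Preparation-test: unnormalized states whose traces sum to one.
prep_test = {
    "zero": [[F(1, 3), F(0)], [F(0), F(0)]],            # (1/3) |0><0|
    "plus": [[F(1, 3), F(1, 3)], [F(1, 3), F(1, 3)]],   # (2/3) |+><+|
}
# Observation-test: a POVM whose effects sum to the identity.
obs_test = {
    "P0": [[F(1), F(0)], [F(0), F(0)]],
    "P1": [[F(0), F(0)], [F(0), F(1)]],
}

total_trace = sum(trace(rho) for rho in prep_test.values())
joint = {(i, j): born(p, rho)
         for i, rho in prep_test.items()
         for j, p in obs_test.items()}
```

Here `joint[(i, j)]` is the probability that the source emits the i-th state and the measurement returns outcome j; the joint probabilities sum to one because the preparation-test is normalized and the effects sum to the identity.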

Notice that according to the definition of states and
effects as equivalence classes, states are separating for
effects and effects are separating for states, that is,

    $|\rho_0)_A = |\rho_1)_A \iff (a|\rho_0)_A = (a|\rho_1)_A \quad \forall a \in \mathfrak{E}(A)$
    $(a_0|_A = (a_1|_A \iff (a_0|\rho)_A = (a_1|\rho)_A \quad \forall \rho \in \mathfrak{S}(A)$.
                                                                  (19)

Since states (effects) are functions from effects (states) to
probabilities, one can take linear combinations of them.
This defines two real vector spaces $\mathfrak{S}_{\mathbb{R}}(A)$ and $\mathfrak{E}_{\mathbb{R}}(A)$,
one dual of the other (we recall that the dual of a real
vector space V is the real vector space $V^*$ of all linear
functions from V to $\mathbb{R}$). In this paper we will always
restrict our attention to the case of sets of states that
span finite dimensional vector spaces. In this case, by
construction one has

    $\dim(\mathfrak{S}_{\mathbb{R}}(A)) = \dim(\mathfrak{E}_{\mathbb{R}}(A))$.        (20)

Notice that a spanning set for $\mathfrak{S}_{\mathbb{R}}(A)$ is a separating set
for $\mathfrak{E}_{\mathbb{R}}(A)$, while a spanning set for $\mathfrak{E}_{\mathbb{R}}(A)$ is a separating
set for $\mathfrak{S}_{\mathbb{R}}(A)$.

Moreover, linear combinations with positive coefficients
define two convex cones $\mathfrak{S}_+(A)$ and $\mathfrak{E}_+(A)$ (we
recall that a set S is a cone if for every $x \in S$ and for
every $\lambda \ge 0$ one has $\lambda x \in S$, whereas the set is convex
if for every $x, y \in S$ and for every $p \in [0,1]$ one has
$px + (1-p)y \in S$). Since the pairing between states and
effects yields positive numbers, one has the inclusions

    $\mathfrak{E}_+(A) \subseteq \mathfrak{S}_+(A)^*$
    $\mathfrak{S}_+(A) \subseteq \mathfrak{E}_+(A)^*$,        (21)

where $\mathfrak{S}_+(A)^*$ and $\mathfrak{E}_+(A)^*$ are the dual cones of $\mathfrak{S}_+(A)$
and $\mathfrak{E}_+(A)$, respectively. We recall that the dual of a
cone S in some vector space V is the cone $S^*$ defined by
$S^* := \{\lambda \in V^* \mid \lambda(x) \ge 0 \ \forall x \in S\}$.
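
As a worked example of these vector spaces and cones, consider a qubit in quantum theory. The parametrization below (the symbols $t, \mathbf{r}, s, \mathbf{u}$ and the Pauli matrices $\sigma_x, \sigma_y, \sigma_z$) is introduced here only for illustration and is not notation from the paper.

```latex
% Qubit example: state and effect spaces both consist of 2x2 Hermitian matrices.
\begin{align*}
  \rho &= \tfrac{1}{2}\left(t\, I + \mathbf{r}\cdot\boldsymbol{\sigma}\right),
  \qquad t \in \mathbb{R},\ \mathbf{r} \in \mathbb{R}^3
  &&\Longrightarrow\quad \dim \mathfrak{S}_{\mathbb{R}}(A) = 4, \\
  P &= \tfrac{1}{2}\left(s\, I + \mathbf{u}\cdot\boldsymbol{\sigma}\right),
  \qquad s \in \mathbb{R},\ \mathbf{u} \in \mathbb{R}^3
  &&\Longrightarrow\quad \dim \mathfrak{E}_{\mathbb{R}}(A) = 4, \\
  (P|\rho)_A &= \mathrm{Tr}[P\rho] = \tfrac{1}{2}\left(s\,t + \mathbf{u}\cdot\mathbf{r}\right), \\
  \rho \ge 0 &\iff t \ge |\mathbf{r}|,
  \qquad
  P \ge 0 \iff s \ge |\mathbf{u}| .
\end{align*}
% The cone of positive semidefinite matrices is self-dual under the trace pairing,
% so for a qubit both inclusions in Eq. (21) hold with equality.
```

In this example the dimension count agrees with Eq. (20), and both cones coincide with the cone of positive semidefinite matrices.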

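We conclude this Subsection by noting that every event
$\mathcal{C}_k$ from A to B induces a linear map $\hat{\mathcal{C}}_k$ from $\mathfrak{S}_{\mathbb{R}}(A)$ to
$\mathfrak{S}_{\mathbb{R}}(B)$, uniquely defined by

    $\hat{\mathcal{C}}_k : |\rho) \in \mathfrak{S}(A) \mapsto \mathcal{C}_k\,|\rho)_A \in \mathfrak{S}(B)$.        (22)

Likewise, for every system C the event $\mathcal{C}_k \otimes \mathcal{I}_C$ induces
a linear map from $\mathfrak{S}_{\mathbb{R}}(AC)$ to $\mathfrak{S}_{\mathbb{R}}(BC)$. From a statistical
point of view, if two events $\mathcal{C}_k$ and $\mathcal{C}'_k$ induce the
same maps for every possible system C, then they are
indistinguishable.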

Definition 11 (Transformations) Equivalence
classes of indistinguishable events from A to B are
called transformations from A to B.

Again, we will assume that equivalence classes have
been taken from the start and, consequently, we
will identify events with transformations, without intro- 
ducing new notation. Accordingly, a test will be a col- 
lection of transformations. 

Remark (transformations and tests in quantum
theory). In quantum theory, a transformation is usually
called a quantum operation. Technically speaking, a
quantum operation from A to B is a linear, completely
positive, trace non-increasing map sending density matrices
of system A to (unnormalized) density matrices of
system B. A test $\{\mathcal{C}_i\}_{i\in X}$ from A to B is typically referred
to as a quantum instrument, and is a collection
of quantum operations with the property that $\sum_{i\in X} \mathcal{C}_i$
is trace-preserving, namely $\sum_{i\in X} \mathrm{Tr}[\mathcal{C}_i(\rho)] = \mathrm{Tr}[\rho]$ for
every state $\rho$.
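
The trace condition for a quantum instrument can be illustrated with the following Python sketch. It assumes that each quantum operation is given in Kraus form, $\rho \mapsto \sum_K K \rho K^\dagger$, and uses the computational-basis measurement of a qubit as a hypothetical instrument; the helper names are likewise assumptions.

```python
from fractions import Fraction as F

def mat_mul(a, b):
    """Matrix product a @ b for matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def dagger(m):
    """Conjugate transpose."""
    return [[m[r][c].conjugate() for r in range(len(m))]
            for c in range(len(m[0]))]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def apply_operation(kraus_ops, rho):
    """Quantum operation in Kraus form: rho -> sum_K K rho K^dagger."""
    out = [[F(0)] * len(rho) for _ in rho]
    for k in kraus_ops:
        out = mat_add(out, mat_mul(mat_mul(k, rho), dagger(k)))
    return out

P0 = [[F(1), F(0)], [F(0), F(0)]]
P1 = [[F(0), F(0)], [F(0), F(1)]]
instrument = {0: [P0], 1: [P1]}   # one Kraus operator per outcome

rho = [[F(1, 3), F(1, 3)], [F(1, 3), F(2, 3)]]   # a normalized qubit state
outputs = {i: apply_operation(ks, rho) for i, ks in instrument.items()}
outcome_probs = {i: trace(out) for i, out in outputs.items()}
```

Each single operation decreases the trace, and the trace of its output is the probability of the corresponding outcome; only the sum over all outcomes is trace-preserving.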

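Remark (different transformations). Note that
two transformations $\mathcal{C}, \mathcal{D} \in \mathfrak{T}(A,B)$ can be different
even if $\mathcal{C}\,|\rho)_A = \mathcal{D}\,|\rho)_A$ for every $\rho \in \mathfrak{S}(A)$: indeed, to
make $\mathcal{C}$ different from $\mathcal{D}$ it is enough that there exists
an ancillary system C and a joint state $|\rho)_{AC}$ such that
$(\mathcal{C} \otimes \mathcal{I}_C)\,|\rho)_{AC} \neq (\mathcal{D} \otimes \mathcal{I}_C)\,|\rho)_{AC}$. We will come back to
this point when discussing local discriminability in Sect.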

The following definitions will be used throughout the paper:

Definition 12 (Channel) A deterministic transformation
$\mathcal{C} \in \mathfrak{T}(A,B)$ is called a channel.

Definition 13 (Reversible channel) A channel $\mathcal{U} \in \mathfrak{T}(A,B)$
is called reversible if there is another channel
$\mathcal{U}^{-1} \in \mathfrak{T}(B,A)$ such that

    A --[ U ]-- B --[ U^{-1} ]-- A   =   A --[ I_A ]-- A
    B --[ U^{-1} ]-- A --[ U ]-- B   =   B --[ I_B ]-- B        (23)

If there exists a reversible channel $\mathcal{U}$ from A to B,
then the systems A and B are operationally equivalent,
in the sense of Def. 5. Note that the reversible channels
from A to itself form a group. We will denote this group
by $\mathbf{G}_A$.

We can now consider states that are invariant under
the group of reversible transformations $\mathbf{G}_A$:

Definition 14 (Invariant states) A state $\rho \in \mathfrak{S}(A)$ is
invariant under the action of the group $\mathbf{G}_A$ if

    $\mathcal{U}\,|\rho)_A = |\rho)_A \quad \forall\, \mathcal{U} \in \mathbf{G}_A$.        (24)

Similarly, we can consider channels with invariant output,
that we call twirling channels.

Definition 15 (Twirling channels/Twirling tests)
A channel $\mathcal{T} \in \mathfrak{T}(A)$ is a twirling channel if

    A --[ T ]-- A --[ U ]-- A   =   A --[ T ]-- A        $\forall\, \mathcal{U} \in \mathbf{G}_A$.        (25)

If a test $\{\mathcal{C}_i\}_{i\in X}$ is such that $\sum_{i\in X} \mathcal{C}_i$ is a twirling channel,
we call it a twirling test.

We will see that in a theory with purification there is a
unique invariant state and a unique twirling channel for
every system.
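
Definitions 14 and 15 can be illustrated in a classical toy model, which is not a theory with purification and serves only to show the definitions at work. The sketch below assumes a classical three-level system whose reversible channels are exactly the permutations of the three perfectly distinguishable pure states; the function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import permutations

def apply_perm(perm, state):
    """Reversible classical channel: moves the weight of outcome k to perm[k]."""
    out = [F(0)] * len(state)
    for k, weight in enumerate(state):
        out[perm[k]] += weight
    return out

def twirl(state):
    """Average of a state over all reversible channels (here: all permutations)."""
    group = list(permutations(range(len(state))))
    total = [F(0)] * len(state)
    for perm in group:
        for k, weight in enumerate(apply_perm(perm, state)):
            total[k] += weight
    return [t / len(group) for t in total]

state = [F(1, 2), F(1, 3), F(1, 6)]
twirled = twirl(state)

# The output of the twirling map is invariant under every reversible channel.
output_is_invariant = all(apply_perm(perm, twirled) == twirled
                          for perm in permutations(range(3)))
```

In this toy model the twirling map sends every normalized state to the uniform distribution, which is the only normalized state invariant under all permutations.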
G. Relation with the convex sets framework

The standard assumption in the literature is that, since
the experimenter is free to randomize the choice of devices
with arbitrary probabilities, all sets of states, effects,
and transformations are convex. We will call the
theories satisfying this assumption "convex". The assumption
of convexity will be clarified in Subsect. III D
in the context of causal theories. Nevertheless, for many
of our results the assumption of convexity is not essential,
and we will discuss the validity of our results in
non-convex theories, like the toy-theories considered by
Spekkens in Ref. [p9|. Bearing this in mind, whenever
possible we will present our results in a convexity-independent
language. We will add the specification
"convex" to the theory for those particular results in
which convexity is essential.

In addition to the convexity of all sets of states, ef- 
fects, and transformations, the usual convex sets frame- 
work (see e.g. Refs. ^ and, more recently, 
Refs. |l^) includes an assumption of mathematical 
simplicity. The assumption is that every binary probability
rule describes the statistics of a possible two-outcome
experiment. Precisely, with the expression "probability
rule" we mean a collection of positive linear functionals
$\{a_i\}_{i\in X} \subset \mathfrak{S}_+(A)^*$ such that $\sum_{i\in X} (a_i|\rho)_A = 1$ for every
deterministic state $\rho \in \mathfrak{S}(A)$. We will refer to this assumption
as the "no-restriction hypothesis", as it states that
there is no restriction on the set of (binary) probability
rules that can be implemented in actual experiments.

Definition 16 (No-restriction hypothesis) A probabilistic 
theory satisfies the no-restriction hypothesis if 
every binary probability rule $\{a_0,a_1\} \subset \mathfrak{S}^*_+(A)$ is an 
observation-test. 

In this paper we will not make this assumption. How- 
ever, we will discuss a few implications of it in subsections 
VnDlandl 



H. Coarse-graining and refinement 

Here we give some definitions that will be often used 
in this paper. 

Definition 17 (Coarse-graining) A test $\{\mathscr{C}_i\}_{i\in X}$ is a 
coarse-graining of the test $\{\mathscr{D}_j\}_{j\in Y}$ if there is a partition 
of $Y$ into disjoint sets $Y_i$ such that $\mathscr{C}_i = \sum_{j\in Y_i} \mathscr{D}_j$ for 
every $i \in X$. 

Since we can always decide to join two (or more) outcomes 
into a single outcome, the set of all tests must be 
closed under coarse-graining. 
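
As an illustration of Definition 17, coarse-graining can be sketched for a classical observation-test. The sketch assumes, for illustration only, that each effect is stored as a list of response probabilities, one per pure state of a three-level classical system; the names `coarse_grain`, `fine_test`, and `partition` are hypothetical and not taken from the paper.

```python
def coarse_grain(test, partition):
    """Sum the events of `test` (dict: outcome j -> vector) over each block Y_i."""
    coarse = {}
    for i, block in partition.items():
        vectors = [test[j] for j in block]
        # Entry-wise sum of the vectors whose outcomes are joined together.
        coarse[i] = [sum(column) for column in zip(*vectors)]
    return coarse

# Fine-grained observation-test with outcomes Y = {0, 1, 2}.
# Each column sums to 1, so some outcome always occurs.
fine_test = {
    0: [0.5, 0.25, 0.0],
    1: [0.25, 0.25, 0.5],
    2: [0.25, 0.5, 0.5],
}

# Partition of Y into the blocks Y_a = {0, 1} and Y_b = {2}.
partition = {"a": [0, 1], "b": [2]}
coarse_test = coarse_grain(fine_test, partition)

# Joining every outcome gives a single-outcome test, i.e. a deterministic effect.
unit_effect = coarse_grain(fine_test, {"all": [0, 1, 2]})["all"]
```

In this toy model the fully coarse-grained test assigns probability 1 to every state, which is the classical counterpart of a deterministic effect.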

The inverse of coarse-graining is refinement: 

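Definition 18 (Refinement of a test) If $\{\mathscr{C}_i\}_{i\in X}$ is a 
coarse-graining of $\{\mathscr{D}_j\}_{j\in Y}$, we say that $\{\mathscr{D}_j\}_{j\in Y}$ is a 
refinement of $\{\mathscr{C}_i\}_{i\in X}$. 

Definition 19 (Refinement of an event) A refinement 
of the event $\mathscr{C}$ is given by a test $\{\mathscr{D}_j\}_{j\in Y}$ and a 
subset $Y_0 \subseteq Y$ such that $\mathscr{C} = \sum_{j\in Y_0} \mathscr{D}_j$. 

Definition 20 We say that an event $\mathscr{D} \in \mathfrak{T}(A,B)$ refines 
$\mathscr{C} \in \mathfrak{T}(A,B)$, and write $\mathscr{D} \prec \mathscr{C}$, if there exists a 
refinement of $\mathscr{C}$ such that $\mathscr{D} \in \{\mathscr{D}_j\}_{j\in Y_0}$. 

Definition 21 (Refinement set) The refinement set 
$D_{\mathscr{C}}$ of an event $\mathscr{C} \in \mathfrak{T}(A,B)$ is the set of all events $\mathscr{D}$ 
that refine $\mathscr{C}$, namely $D_{\mathscr{C}} := \{\mathscr{D} \in \mathfrak{T}(A,B) \mid \mathscr{D} \prec \mathscr{C}\}$. 

Definition 22 (Atomic vs refinable events) An 
event $\mathscr{C}$ is called atomic if it admits only trivial 
refinements, or equivalently, if $\mathscr{D} \prec \mathscr{C}$ implies $\mathscr{D} = \lambda \mathscr{C}$ 
for some $\lambda \in [0,1]$. An event is refinable if it is not 
atomic. 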

In the case of preparation-events the notion of refine- 
ment gives rise to the definitions of pure and mixed 
states: 

Definition 23 (Pure vs mixed states) An atomic 
preparation-event $\rho \in \mathfrak{S}(A)$ is called a pure state. A 
refinable preparation-event is called a mixed state. 

Clearly, in a convex theory a state $\rho$ is pure if and only if 
it is an extreme point of the convex set $\mathfrak{S}(A)$. Moreover, 
in a convex theory the refinement set $D_\rho$ is a convex subset 
of the state space. For example, in quantum theory 
the refinement set of a density matrix $\rho$ is the set of all 
(unnormalized) density matrices $\sigma$ such that $\sigma \le \rho$, and 
is clearly convex. Note that the condition $\sigma \le \rho$ implies 
that the support of $\sigma$ is contained in the support of $\rho$. 
In fact, any density matrix $\sigma$ with $\mathrm{Supp}(\sigma) \subseteq \mathrm{Supp}(\rho)$ 
is proportional to a matrix in $D_\rho$. In particular, if the 
support of $\rho$ is the whole Hilbert space (that is, if $\rho$ is 
a full-rank matrix), then any density matrix is proportional 
to a matrix in $D_\rho$. In this case $D_\rho$ is a spanning 
set for the set of all hermitian operators. The analogue of 
a full-rank density matrix in the general context is given 
by the notion of internal state: 

Definition 24 (Internal state) A state $\omega \in \mathfrak{S}(A)$ is 
internal if its refinements span the whole state space, i.e. 
if $\mathrm{Span}(D_\omega) = \mathfrak{S}_{\mathbb R}(A)$. 

In the probabilistic theories considered in this paper 
every preparation-test $\{\rho_i\}_{i\in X}$ for system A admits an 
ultimate refinement $\{\psi_j\}_{j\in Y}$, such that each state $\psi_j$ is 
pure. Using the states-transformations isomorphism we 
will also prove in Sect. ^ that in a theory with purification 
this property is enough to imply that every 
test $\{\mathscr{C}_i\}_{i\in X}$ from A to B admits an ultimate refinement 
$\{\mathscr{D}_j\}_{j\in Y}$, such that each event $\mathscr{D}_j$ is atomic. 



I. Discrimination and distance 

By making tests one can try to discriminate between 
different devices. For example, imagine that we have a 
black box preparing one of the two deterministic states 
$\rho_0, \rho_1 \in \mathfrak{S}(A)$, and that we want to find out which one. 
To discriminate between the two states we can perform 
a binary observation-test $\{a_0,a_1\}$. The probabilities of 
outcomes are then given by 

$$p(j|i) = (a_j|\rho_i)_A, \qquad i,j = 0,1. \tag{26}$$

Assuming prior probabilities $\pi_0, \pi_1$ for the states $\rho_0, \rho_1$, 
respectively, we can try to maximize the (average) probability 
of correct discrimination, defined as $p_{\rm succ} := \pi_0\, p(0|0) + \pi_1\, p(1|1)$. 
Substituting the expression for 
the probabilities given in Eq. (26) and using the fact that 
probabilities sum up to unit, we obtain 

$$p_{\rm succ} = \pi_0 + (a_1|\pi_1\rho_1 - \pi_0\rho_0)_A = \pi_1 + (a_0|\pi_0\rho_0 - \pi_1\rho_1)_A, \tag{27}$$

and, optimizing over all binary tests, 

$$p^{\rm (opt)}_{\rm succ} = \pi_0 + \sup_{a_1\in\mathfrak{E}(A)} (a_1|\pi_1\rho_1 - \pi_0\rho_0)_A = \pi_1 + \sup_{a_0\in\mathfrak{E}(A)} (a_0|\pi_0\rho_0 - \pi_1\rho_1)_A. \tag{28}$$

Summing the two expressions above we finally get 

$$p^{\rm (opt)}_{\rm succ} = \frac{1 + \|\pi_1\rho_1 - \pi_0\rho_0\|_A}{2}, \tag{29}$$

where $\|\cdot\|_A$ is the operational norm defined by 

$$\|\delta\|_A := \sup_{a_1\in\mathfrak{E}(A)} (a_1|\delta)_A - \inf_{a_0\in\mathfrak{E}(A)} (a_0|\delta)_A, \qquad \delta \in \mathfrak{S}_{\mathbb R}(A). \tag{30}$$

Note that the norm $\|\pi_1\rho_1 - \pi_0\rho_0\|_A$ ranges between 0 
(when the two states and the prior probabilities are 
equal) and 1 (when the two states are perfectly discriminable). 
For real numbers $x \in \mathfrak{S}_{\mathbb R}(I) = \mathbb{R}$ one has 
$\|x\|_I = |x|$. 

Remark (operational norm in quantum theory). 
In quantum theory the operational norm is the usual 
trace-norm $\|\cdot\|_1$: indeed, if we denote by $\delta_+$ and $\delta_-$ 
the positive and negative part of the hermitian operator 
$\delta = \pi_1\rho_1 - \pi_0\rho_0$, respectively, we obtain $\|\delta\|_A = 
\mathrm{Tr}[\delta_+] - \mathrm{Tr}[\delta_-] = \|\delta\|_1$. 
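
The relation between Eq. (29) and the trace norm can be checked numerically for a qubit. The following is a minimal sketch, assuming 2x2 hermitian matrices stored as nested lists and using the closed-form eigenvalues of a 2x2 hermitian matrix; the helper names (`eigvals_2x2_hermitian`, `trace_norm`, `optimal_success`) are illustrative choices, not notation from the paper.

```python
import math

def eigvals_2x2_hermitian(m):
    """Eigenvalues of a 2x2 hermitian matrix [[a, b], [conj(b), d]]."""
    a, b, d = m[0][0].real, m[0][1], m[1][1].real
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return mean - radius, mean + radius

def trace_norm(m):
    """Trace norm: sum of the absolute values of the eigenvalues."""
    return sum(abs(x) for x in eigvals_2x2_hermitian(m))

def optimal_success(rho0, rho1, pi0=0.5, pi1=0.5):
    """Eq. (29), with the operational norm evaluated as a trace norm."""
    delta = [[pi1 * rho1[i][j] - pi0 * rho0[i][j] for j in range(2)]
             for i in range(2)]
    return (1 + trace_norm(delta)) / 2

ket0 = [[1, 0], [0, 0]]          # |0><0|
ket1 = [[0, 0], [0, 1]]          # |1><1|
plus = [[0.5, 0.5], [0.5, 0.5]]  # |+><+|

p_orthogonal = optimal_success(ket0, ket1)  # perfectly discriminable states
p_identical = optimal_success(ket0, ket0)   # equal states, equal priors
p_overlap = optimal_success(ket0, plus)     # non-orthogonal pure states
```

With equal priors, orthogonal states give success probability 1, identical states give 1/2, and the pair |0>, |+> gives an intermediate value, in agreement with the range of the norm stated above.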

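In addition to the defining properties of a norm, the 
operational norm has a simple monotonicity property: 

Lemma 1 (Monotonicity of the operational norm) 
If $\mathscr{C} \in \mathfrak{T}(A,B)$ is a channel from A to B, then for every 
$\delta \in \mathfrak{S}_{\mathbb R}(A)$ one has 

$$\|\mathscr{C}\delta\|_B \le \|\delta\|_A. \tag{31}$$

If $\mathscr{C}$ is reversible one has the equality. 

Proof. By definition, $\|\mathscr{C}\delta\|_B = \sup_{b_1\in\mathfrak{E}(B)} (b_1|_B\,\mathscr{C}|\delta)_A - 
\inf_{b_0\in\mathfrak{E}(B)} (b_0|_B\,\mathscr{C}|\delta)_A$. Since $(b_1|_B\,\mathscr{C}$ and $(b_0|_B\,\mathscr{C}$ 
are effects on system A, one has $\|\mathscr{C}\delta\|_B \le 
\sup_{a_1\in\mathfrak{E}(A)} (a_1|\delta)_A - \inf_{a_0\in\mathfrak{E}(A)} (a_0|\delta)_A = \|\delta\|_A$. 
Clearly, if $\mathscr{C}$ is reversible one has the converse bound 
$\|\delta\|_A = \|\mathscr{C}^{-1}\mathscr{C}\delta\|_A \le \|\mathscr{C}\delta\|_B$, thus proving the equality 
$\|\delta\|_A = \|\mathscr{C}\delta\|_B$. ■ 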

For a generic state $\rho \in \mathfrak{S}(A)$, Eq. (30) reduces to 

$$\|\rho\|_A = \sup\nolimits'_{e\in\mathfrak{E}(A)}\, (e|\rho), \tag{32}$$

where $\sup'$ denotes the supremum restricted to the set 
of deterministic effects. We can now give the notion of 
normalized states: 

Definition 25 (Normalized states) A state $\rho \in 
\mathfrak{S}(A)$ is normalized if $\|\rho\|_A = 1$. We will denote the 
set of normalized states by $\mathfrak{S}_1(A)$. 



Clearly, if $\rho$ is deterministic, then Eq. (32) implies that 
it is normalized (since $\rho$ corresponds to a single-outcome 
preparation-test and $e$ to a single-outcome observation-test, 
the probability of the only possible outcome, given 
by $(e|\rho)_A$, must be unit). In Sect. III we will consider 
causal theories, where the deterministic effect $e \in \mathfrak{E}(A)$ 
is unique, and, therefore, one has $\|\rho\|_A = (e|\rho)_A$. In this 
context one also has the converse: if a state is normalized, 
then it is deterministic. 

Definition 26 (Distinguishable states, discriminating 
tests) The states $\{\rho_i\}_{i\in X}$ are perfectly distinguishable 
if there is a test $\{a_i\}_{i\in X}$ such that 

$$(a_j|\rho_i) = \|\rho_i\|_A\, \delta_{ij}. \tag{33}$$

The test $\{a_i\}_{i\in X}$ is called a discriminating test. 

Remark (Distinguishable states and discriminating 
test in quantum theory). In quantum theory 
a set of distinguishable states $\{\rho_i\}_{i=1}^{n}$ is a set of density 
matrices with orthogonal support. An example of discriminating 
test for this set is the collection of orthogonal 
projectors $\{P_i\}_{i=1}^{n}$, where $P_i$ is the projector on the support 
of $\rho_i$ for all $i < n$, while $P_n = I - \sum_{i=1}^{n-1} P_i$. Clearly, 
the maximum number of distinguishable states available 
for a certain system is the dimension $d$ of the corresponding 
Hilbert space. In this case, the distinguishable states 
are rank-one projectors on an orthonormal basis, and the 
corresponding discriminating test is the projective measurement 
on the same basis. 

If we want a theory that can describe the exchange of 
classical messages, we need at least two states $\rho_0$ and $\rho_1$ 
that are deterministic and perfectly distinguishable. In 
this case, a sender can encode a classical bit $b = 0,1$ in 
these two states and a receiver can decode the message 
perfectly by using the binary discriminating test $\{a_0, a_1\}$. 
Indeed, one has $p(j|i) = \delta_{ij}$. Clearly, using this encoding 
for every bit in a string allows perfect deterministic 
decoding of the whole string. 

We conclude this Subsection with a simple Lemma that 
will be useful in the discussion of the general no-cloning 
theorem for probabilistic theories (see Theorem n3): 
Lemma 2 In any convex theory, if two deterministic 
states $\rho_0, \rho_1 \in \mathfrak{S}(A)$ are distinct (i.e. $\rho_0 \neq \rho_1$), then 
there exists a binary test $\{a_0,a_1\}$ such that 

$$p(1|0) = p(0|1) < \frac{1}{2}. \tag{34}$$

Proof. Since the states are distinct there exists at least 
an effect $a$ such that $(a|\rho_0) > (a|\rho_1)$. Moreover, since the 
theory is convex we can choose without loss of generality 
$(a|\rho_1) \ge 1/2$ (if $a$ does not meet this condition, we can 
replace it with the convex combination $a' = \frac{1}{2}(a + e)$). 
Now define the binary test $\{a_0,a_1\}$ by the convex combination 

$$a_0 = q\,a + (1-q)\,0, \qquad a_1 = e - a_0, \qquad q = \frac{1}{(a|\rho_0) + (a|\rho_1)}, \tag{35}$$

where 0 is the null effect, defined by $(0|\rho)_A = 0$, $\forall \rho \in 
\mathfrak{S}(A)$. For this test one has $p(1|0) = p(0|1) = 
(a|\rho_1)/[(a|\rho_0) + (a|\rho_1)] < 1/2$. ■ 

The above Lemma states that if two states are different, 
then the worst-case error probability, defined as 
$p_{\rm wc} := \max\{p(1|0), p(0|1)\}$, can be reduced to a value 
that is strictly smaller than 1/2. In other words, if two 
states are different, then in the worst-case scenario we 
can always distinguish between them better than with a 
random guess. 
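
The construction of Eq. (35) can be traced step by step on a small example. The sketch below assumes a classical two-level system in which states are probability vectors, effects are vectors with entries in [0, 1], and the pairing (a|ρ) is a dot product; the names `pair` and `lemma2_test` and the numerical values are hypothetical choices made only for illustration.

```python
def pair(effect, state):
    """Pairing (a|rho) for a classical system: a dot product."""
    return sum(x * y for x, y in zip(effect, state))

def lemma2_test(a, rho0, rho1):
    """Binary test {a0, a1} of Eq. (35); assumes (a|rho0) > (a|rho1)."""
    unit = [1.0] * len(a)  # deterministic effect e
    if pair(a, rho1) < 0.5:
        # Replace a with the convex combination a' = (a + e) / 2.
        a = [(x + u) / 2 for x, u in zip(a, unit)]
    q = 1 / (pair(a, rho0) + pair(a, rho1))
    a0 = [q * x for x in a]                 # a0 = q a + (1 - q) 0
    a1 = [u - x for u, x in zip(unit, a0)]  # a1 = e - a0
    return a0, a1

rho0 = [0.6, 0.4]
rho1 = [0.5, 0.5]
a = [0.4, 0.0]  # (a|rho0) = 0.24 > (a|rho1) = 0.20

a0, a1 = lemma2_test(a, rho0, rho1)
p_1_given_0 = pair(a1, rho0)  # error when rho0 was prepared
p_0_given_1 = pair(a0, rho1)  # error when rho1 was prepared
```

In this example the two error probabilities coincide and are slightly below 1/2, as the Lemma predicts for any pair of distinct states.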



J. Closure 

The closure of $\mathfrak{S}(A)$ with respect to the operational 
norm contains all the elements of $\mathfrak{S}_{\mathbb R}(A)$ that can be 
approximated arbitrarily well by physical states: a vector 
$\rho \in \mathfrak{S}_{\mathbb R}(A)$ is in the closure if there is a sequence of states 
$\{\rho_n\}$ such that $\lim_{n\to\infty} \|\rho - \rho_n\|_A = 0$. Since $\mathfrak{S}_{\mathbb R}(A)$ is 
finite dimensional, it is natural to assume that all such 
vectors correspond to physical states. We will make this 
assumption in the paper. In particular, assuming that 
the set $\mathfrak{S}(I)$ of states of the trivial system is closed with 
respect to the operational norm means assuming that the 
probabilities appearing in the theory form a closed subset 
of the interval [0, 1]. In fact, we have the following: 

Lemma 3 If an operational-probabilistic theory is not 
deterministic, then $\mathfrak{S}(I)$ is dense in the interval [0, 1]. 

Proof. If the theory is not deterministic there is a binary 
test giving outcomes 0, 1 with probabilities $q_0, q_1 \neq 0$, respectively. 
Now, this test provides a biased coin, which 
can be tossed many times, thus allowing for the approximation 
of any coin with bias $p \in [0, 1]$ [30]. ■ 

Therefore, if we assume that the set of states $\mathfrak{S}(I)$ is 
closed, then the previous Lemma implies the following: 

Corollary 1 If $\mathfrak{S}(I)$ is closed, then it is the whole interval 
[0,1]. 
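
One simple way to see how repeated tosses of a biased coin approximate an arbitrary bias is sketched below. It is not necessarily the construction of the reference cited in the proof of Lemma 3: it greedily coarse-grains the 2^n outcome strings of n tosses into a single event whose probability approaches the target p from below. The function name `approximate_bias` and the numerical values are assumptions of this sketch.

```python
from itertools import product

def approximate_bias(q0, p, n):
    """Probability of a coarse-grained event of n tosses that approximates p.

    Each toss gives outcome 0 with probability q0 and 1 with probability 1 - q0.
    """
    q1 = 1 - q0
    # Probabilities of all 2**n outcome strings, largest first.
    weights = sorted(
        (q0 ** s.count(0) * q1 ** s.count(1)
         for s in product((0, 1), repeat=n)),
        reverse=True,
    )
    total = 0.0
    for w in weights:
        if total + w <= p:  # add the string only if we stay below p
            total += w
    return total

q0, target, tosses = 0.3, 0.5, 12
approx = approximate_bias(q0, target, tosses)
# For p < 1 the error is smaller than the largest string probability.
error_bound = max(q0, 1 - q0) ** tosses
```

Since the bound decays exponentially in the number of tosses, any bias in [0, 1] can be approached with arbitrary precision, which is the content of the density statement.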

In Subsect. III D we will discuss the relation between 
closure and convexity in the context of causal theories. 



III. CAUSAL THEORIES 

In this Section we restrict our attention to causal theories, 
in which the probability of outcomes of an experiment 
at a given time does not depend on the choice of 
experiments performed at later times. 



A. Definition and main properties 

Although in the circuits discussed until now we had 
sequences of tests, such sequences were not necessarily 
causal sequences. The input-output arrow determined 
by the connections of physical devices was not necessarily 
the causal arrow defined by a signalling structure. In 
fact, one can formulate operational-probabilistic theories 
even in the absence of a pre-defined causal arrow, and 
this is a crucial point to formulate a quantum theory 
of gravity (see e.g. Hardy in Ref. I^^). A concrete 
example of non-causal theory is the theory studied in 
Refs. where the states are quantum operations, 
and the transformations are "supermaps" transforming 
quantum operations into quantum operations. In this 
case, transforming a "state" means inserting the corresponding 
quantum operation in a larger circuit, and the 
sequence of two such transformations is not a causal sequence. 
However, the analysis of non-causal theories is 
beyond the scope of the present work. We now give the condition 
that allows us to interpret sequential composition 
as a causal cascade: 

Definition 27 (Causal theories) A theory is causal if 
for every preparation-test $\{\rho_i\}_{i\in X}$ and every observation-test 
$\{a_j\}_{j\in Y}$ on system A the marginal probability 
$p_i := \sum_{j\in Y} (a_j|\rho_i)_A$ is independent of the choice of 
the observation-test $\{a_j\}_{j\in Y}$. Precisely, if $\{a_j\}_{j\in Y}$ and 
$\{b_k\}_{k\in Z}$ are two different observation-tests, then one has 

$$\sum_{j\in Y} (a_j|\rho_i)_A = \sum_{k\in Z} (b_k|\rho_i)_A. \tag{36}$$

Loosely speaking, we may say that the condition of 
Eq. (36) expresses the principle of "no-signalling from 
the future". 

Causal theories have a simple characterization: 

Lemma 4 (Characterization of causal theories) A 
theory is causal if and only if for every system A there 
is a unique deterministic effect $(e|_A$. 

Proof. Suppose that $e$ and $e'$ are two deterministic effects 
for system A. Since deterministic effects belong to 
single-outcome tests, Eq. (36) gives $(e|\rho_i)_A = (e'|\rho_i)_A$ 
for every state $\rho_i$. Therefore, $e = e'$. Conversely, suppose 
that the deterministic effect is unique and take an 
observation-test $\{a_j\}_{j\in Y}$ on system A. Then by coarse-graining 
one obtains a single-outcome test, with deterministic 
effect $(e'|_A = \sum_{j\in Y} (a_j|_A$, and, by uniqueness 
of the deterministic effect, $(e|_A = (e'|_A = \sum_{j\in Y} (a_j|_A$. 
Therefore, for every state $\rho_i$ we have $\sum_{j\in Y} (a_j|\rho_i)_A = 
(e|\rho_i)_A$, independently of the choice of the observation-test 
$\{a_j\}_{j\in Y}$. This proves Eq. (36). ■ 

Remark (quantum theory as an example of 
causal theory). Ordinary quantum theory is an example 
of causal theory. Indeed, there is a unique deterministic 
effect, corresponding to the (trace with the) identity operator 
on the system's Hilbert space. In other words, the 
only operator $P$ satisfying the equation $\mathrm{Tr}_A[P\rho] = 1$ for 
every density matrix $\rho$ is $P = I_A$, the identity on A. 

An immediate consequence of causality is that the deterministic 
effect of a composite system AB is the product 
of the deterministic effects of A and B, as expressed by 
the following 

Corollary 2 (Factorization of the deterministic effect 
on product systems) Let A and B be two arbitrary 
systems. In a causal theory one has 

$$(e|_{AB} = (e|_A\,(e|_B. \tag{37}$$

Proof. Since the parallel composition of two single-outcome 
tests is a single-outcome test, the effect $(e|_A\,(e|_B$ 
is deterministic, according to Def. ^. Since the deterministic 
effect $(e|_{AB}$ is unique, one must have $(e|_A\,(e|_B = 
(e|_{AB}$. ■ 

Note that in a causal theory there is a unique way of 
defining marginal states: 

Definition 28 (Marginal state) The marginal state 
of $|\sigma)_{AB}$ on system A is the state $|\rho)_A := (e|_B\,|\sigma)_{AB}$. 

In a causal theory the channels (deterministic transformations 
corresponding to single-outcome tests) are characterized 
as follows: 

Lemma 5 (Characterization of channels) In a 
causal theory a transformation $\mathscr{C} \in \mathfrak{T}(A,B)$ is a 
channel (Def. ) if and only if 

$$(e|_B\,\mathscr{C} = (e|_A. \tag{38}$$

In particular, a state $\rho \in \mathfrak{S}(B)$ is deterministic if and 
only if $(e|\rho)_B = 1$. 

Proof. If $\mathscr{C}$ is a channel, then $(e|_B\,\mathscr{C}$ is a deterministic 
effect. By uniqueness of the deterministic effect, Eq. 
(38) holds. Conversely, suppose that $\{\mathscr{C}_i\}_{i\in X}$ is a test 
from A to B and $\mathscr{C} = \mathscr{C}_{i_0}$ is a transformation such that 
Eq. (38) holds. By coarse-graining, we can define the 
channel $\mathscr{C}' := \sum_{i\in X} \mathscr{C}_i$. Since $\mathscr{C}'$ is a channel, we must 
have $(e|_A = (e|_B\,\mathscr{C}' = (e|_A + (e|_B \left(\sum_{i\neq i_0} \mathscr{C}_i\right)$, whence 
$(e|_B \left(\sum_{i\neq i_0} \mathscr{C}_i\right) = 0$. But this implies $\sum_{i\neq i_0} \mathscr{C}_i = 0$, and, 
therefore, $\mathscr{C} = \mathscr{C}'$. Hence, $\mathscr{C}$ is a channel. Finally, a 
deterministic state is nothing but a channel with trivial 
input system A = I. Since the deterministic effect of the 
trivial system I is the number 1, the normalization of Eq. 
(38) becomes $(e|\rho)_B = 1$. ■ 

Lemma 5 also leads to the following 

Corollary 3 (Normalization of tests) A test 
$\{\mathscr{C}_i\}_{i\in X}$ from A to B satisfies the normalization 
condition 

$$\sum_{i\in X} (e|_B\,\mathscr{C}_i = (e|_A. \tag{39}$$

In particular, an observation-test $\{a_i\}_{i\in X}$ on system A 
must satisfy the normalization condition 

$$\sum_{i\in X} (a_i|_A = (e|_A. \tag{40}$$

In quantum theory, the normalization condition of Eq. 
(38) means that any quantum channel must be trace-preserving 
(identity preserving in the Heisenberg picture). 
Indeed, the deterministic effect is the identity 
operator, and Eq. (38) implies that, for every quantum 
state $\rho$, one has $\mathrm{Tr}_B[\mathscr{C}(\rho)] = \mathrm{Tr}_A[\rho]$. The normalization 
condition for observation-tests given in Eq. (40) 
is instead the normalization of quantum measurements: 
a quantum measurement is a POVM, that is, a collection 
of positive operators $\{P_i\}_{i\in X}$ satisfying the condition 
$\sum_{i\in X} P_i = I_A$, where $I_A$ is the identity operator on 
the system's Hilbert space. 
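
The quantum reading of the normalization condition can be verified on a concrete channel. The sketch below uses the qubit amplitude-damping channel, written with real Kraus matrices so that the adjoint reduces to a transpose; this choice of channel, the damping value `gamma`, and the helper names are assumptions made for illustration and are not part of the paper.

```python
import math

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]

gamma = 0.3  # damping probability (arbitrary illustrative value)
k0 = [[1.0, 0.0], [0.0, math.sqrt(1 - gamma)]]
k1 = [[0.0, math.sqrt(gamma)], [0.0, 0.0]]

# Heisenberg picture: the channel maps the identity to sum_i K_i^T K_i.
image_of_identity = add(matmul(transpose(k0), k0), matmul(transpose(k1), k1))

# Schroedinger picture: the output state is sum_i K_i rho K_i^T.
rho = [[0.25, 0.1], [0.1, 0.75]]
out = add(matmul(matmul(k0, rho), transpose(k0)),
          matmul(matmul(k1, rho), transpose(k1)))
trace_in = rho[0][0] + rho[1][1]
trace_out = out[0][0] + out[1][1]
```

The identity is preserved in the Heisenberg picture and the trace is preserved in the Schrödinger picture, which are the two equivalent readings of the normalization condition for channels.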

Moreover, in a causal theory we have a simple characterization 
of the normalized states: 

Corollary 4 (Characterization of normalized states) 
Let $\rho$ be a state of system A. In a causal theory the 
following are equivalent 

1. $\rho$ is normalized 

2. $(e|\rho)_A = 1$ 

3. $\rho$ is deterministic. 

Proof. Since there is a unique deterministic effect, 
the expression of the norm given in Eq. (32) yields 
$\|\rho\|_A = (e|\rho)_A$. This proves the equivalence $1 \Leftrightarrow 2$. The 
equivalence $2 \Leftrightarrow 3$ was already proved in Lemma 5. ■ 

For every state $|\rho)_A$ we can consider the normalized 
state 

$$|\bar\rho)_A := \frac{|\rho)_A}{(e|\rho)_A}. \tag{41}$$

Operationally, this means that we can always make 
rescaled preparations: we can perform a preparation-test 
$\{\rho_i\}_{i\in X}$, and, if the test gives outcome $i$, we can 
claim that we prepared the normalized state $\bar\rho_i$. In other 
words, in a causal theory any preparation-event can be 
promoted to a single-outcome preparation-test. Following 
this observation, in a causal theory there is no reason 
to forbid that every normalized state can be actually 
produced in some single-outcome test. This implies that 
every state is proportional to a deterministic one. In the 
following we will always assume this fact as a property 
of causal theories. 

Note that also the converse is true: 

Lemma 6 (Causality is necessary for rescaled 
preparations) A theory where every state is proportional 
to a deterministic one is causal. 

Proof. Let $|\rho)_A$ be an arbitrary state and $e$ and $e'$ be 
two deterministic effects. By hypothesis, we have $|\rho)_A = 
k\,|\bar\rho)_A$, where $\bar\rho$ is deterministic. This implies $(e|\rho)_A = 
k = (e'|\rho)_A$, and, since $\rho$ is arbitrary, $e = e'$. By Lemma 4, 
this implies that the theory is causal. ■ 

Remarkably, the causal principle of "no-signalling from 
the future" implies the impossibility of signalling in space 
without exchange of physical systems: 

Theorem 1 (No-signalling without exchange of 
physical systems) In a causal theory it is impossible 
to have signalling without exchanging systems. 

Proof. Suppose that two distant parties Alice and Bob 
share a bipartite state $|\Psi)_{AB}$, and that Alice (Bob) performs 
a local test $\{\mathscr{A}_i\}_{i\in X}$ ($\{\mathscr{B}_j\}_{j\in Y}$) on the system 
at her (his) disposal. Let us define the joint probability 
$p_{ij} := (e|_{AB}\,(\mathscr{A}_i \otimes \mathscr{B}_j)\,|\Psi)_{AB}$ and its marginal 
$p_i^{(A)} := \sum_j p_{ij}$ ($p_j^{(B)} := \sum_i p_{ij}$) on Alice's (Bob's) side. 
It is immediate to verify that the marginal on Alice's 
side does not depend on the test $\{\mathscr{B}_j\}$ on Bob's side: 
indeed, one has 

$$p_i^{(A)} = \sum_{j\in Y} (e|_A\,(e|_B\,(\mathscr{A}_i \otimes \mathscr{B}_j)\,|\Psi)_{AB} = (e|_A\,(e|_B\,(\mathscr{A}_i \otimes \mathscr{I}_B)\,|\Psi)_{AB} = (e|_A\,\mathscr{A}_i\,|\rho)_A, \tag{42}$$

having used the normalization condition $\sum_j (e|_B\,\mathscr{B}_j = 
(e|_B$ (Corollary 3), and having defined the marginal state 
$|\rho)_A := (e|_B\,|\Psi)_{AB}$. The same reasoning holds for the 
marginal on Bob's side. ■ 
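
The independence of Alice's marginal from Bob's choice of test can be checked numerically in the quantum case. The sketch below is restricted to observation-tests (POVM elements rather than general transformations) on a two-qubit Bell state; the helper names `kron`, `trace_of_product`, and `alice_marginal` are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def trace_of_product(x, y):
    """Tr[x y] for square matrices of equal size."""
    n = len(x)
    return sum(x[i][j] * y[j][i] for i in range(n) for j in range(n))

def alice_marginal(alice_test, bob_test, state):
    """p_i = sum_j Tr[(A_i tensor B_j) Psi] for each outcome i of Alice."""
    return [sum(trace_of_product(kron(a, b), state) for b in bob_test)
            for a in alice_test]

# Bell state (|00> + |11>)/sqrt(2) as a 4x4 density matrix.
bell = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]

z_test = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]                    # {|0>, |1>} basis
x_test = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, -0.5], [-0.5, 0.5]]]  # {|+>, |->} basis

marginal_if_bob_measures_z = alice_marginal(z_test, z_test, bell)
marginal_if_bob_measures_x = alice_marginal(z_test, x_test, bell)
```

Alice's outcome statistics are the same for both of Bob's measurements, even though the shared state is maximally entangled.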



B. Conditioning 

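In a causal sequence the choice of a device can depend 
on the outcomes of previous devices. This gives rise to the 
notion of conditioned test, which generalizes the notion 
of sequential composition: 

Definition 29 (Conditioned test) If $\{\mathscr{C}_i\}_{i\in X}$ is a test 
from A to B and, for every $i$, $\{\mathscr{D}^{(i)}_{j_i}\}_{j_i\in Y_i}$ is a test from 
B to C, then the conditioned test is a test from A to 
C, with outcomes $(i,j_i) \in Z := \bigcup_{i\in X} \{i\} \times Y_i$ and events 
$\{\mathscr{D}^{(i)}_{j_i} \circ \mathscr{C}_i\}_{(i,j_i)\in Z}$. Diagrammatically, the events $\mathscr{D}^{(i)}_{j_i} \circ \mathscr{C}_i$ 
are represented as follows 

$$A \xrightarrow{\;\mathscr{C}_i\;} B \xrightarrow{\;\mathscr{D}^{(i)}_{j_i}\;} C. \tag{43}$$

The above definition of conditioning makes sense in a 
causal theory, where the uniqueness of the deterministic 
effect ensures that the test $\{\mathscr{D}^{(i)}_{j_i} \circ \mathscr{C}_i\}_{i\in X, j_i\in Y_i}$ satisfies 
the normalization condition required by Corollary 3: 

$$\sum_{i\in X}\sum_{j_i\in Y_i} (e|_C\,\mathscr{D}^{(i)}_{j_i}\mathscr{C}_i = \sum_{i\in X} (e|_B\,\mathscr{C}_i = (e|_A. \tag{44}$$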



Conditioning expresses the possibility of choosing what 
to do at a certain step using the classical information 
generated in the previous steps. In a causal operational 
theory there is no reason to forbid an experimenter to 
perform conditioned tests. Accordingly, in the following 
we will assume that in a causal theory any conditioned 
test is allowed. In fact, the possibility to perform condi- 
tioned tests is essentially equivalent to causality. Indeed, 
one has also the converse statement: 

Lemma 7 (Causality is necessary for conditioned 
tests) A theory where every conditioned test is possible 
is causal. 

Proof. To prove that the theory is causal we show 
that for every system A the deterministic effect $(e|_A$ 
is unique. Suppose that $(e|_A$ and $(e'|_A$ are two deterministic 
effects, and let $\rho \in \mathfrak{S}(A)$ be an arbitrary 
state. By definition, there is a preparation-test $\{\rho_i\}_{i\in X}$ 
that contains $\rho$, that is, $\rho = \rho_{i_0}$ for some outcome 
$i_0 \in X$. Moreover, using coarse-graining we obtain the 
two-outcome preparation-test $\{\rho_0, \rho_1\}$, where $\rho_0 = \rho$ 
and $\rho_1 := \sum_{i\neq i_0} \rho_i$. Now, consider the conditioned test 
$\{(e|\rho_0)_A, (e'|\rho_1)_A\}$, defined by the following procedure: 
first perform the preparation-test $\{\rho_0, \rho_1\}$, and then, if 
the outcome is 0, apply the effect $(e|_A$, otherwise apply 
$(e'|_A$. Since $\{(e|\rho_0)_A, (e'|\rho_1)_A\}$ is a test from the trivial 
system to itself one must have 

$$(e|\rho_0)_A + (e'|\rho_1)_A = 1. \tag{45}$$

On the other hand, since the effect $e'$ is deterministic, one 
must have $(e'|\rho_0)_A + (e'|\rho_1)_A = 1$. By comparison, this 
implies $(e|\rho_0)_A = (e'|\rho_0)_A$, and, since $\rho_0$ was a generic 
state, $e = e'$. ■ 

Remark (conditioning with different outputs 
and "direct sum" systems). In principle, one could 
also consider a conditioning where the output system of 
each test $\{\mathscr{D}^{(i)}_{j_i}\}_{j_i\in Y_i}$ is a system $C_i$ that depends on the 
outcome $i$. In this case the output of the conditioned 
test would be a "direct sum" system $C := \bigoplus_{i\in X} C_i$. 
In quantum theory, this situation can be described by introducing 
a superselection rule, according to which the 
possible states of the "direct sum" system are the block-diagonal 
density matrices of the form $\rho = \bigoplus_i p_i\rho_i$, where 
each $\rho_i$ is a density matrix on the Hilbert space associated 
to system $C_i$. This kind of extension would also 
require treating the outcome spaces X as classical systems 
that can be the input or the output of some classical 
information-processing device. However, we will not consider 
this generalization here, as it is not needed for the 
main purpose of the paper. 

A particular case of conditioning is randomization: 

Definition 30 (Randomization) If $\{p_i\}_{i\in X}$ is a 
preparation-test for the trivial system and, for every 
outcome $i$, $\{\mathscr{C}^{(i)}_{j_i}\}_{j_i\in Y_i}$ is a test from A to B, the 
randomized test $\{p_i\,\mathscr{C}^{(i)}_{j_i}\}_{i\in X, j_i\in Y_i}$ is the test from A to 
B with events defined by 

$$p_i\,\mathscr{C}^{(i)}_{j_i} := p_i \otimes \mathscr{C}^{(i)}_{j_i} \tag{46}$$

(on the left-hand side we used the fact that the composition 
with trivial systems is trivial, and, therefore, one 
has AI = A, BI = B). 

If a causal theory is not deterministic (i.e. if the possible 
values of probabilities are not only 0 and 1) then randomization 
and coarse-graining always allow one to construct 
an internal state (see Def. 24): it is enough to take 
a spanning set of states $\{\rho_i\}_{i\in X}$, to randomize them with 
some non-zero probabilities $\{p_i\}_{i\in X}$, and then to coarse-grain, 
thus getting the internal state $\omega := \sum_{i\in X} p_i\rho_i$. 

Finally, conditioning allows one to prove that a causal 
theory contains all possible measure-and-prepare channels, 
defined as follows 

Definition 31 (Measure-and-prepare channels) A 
channel $\mathscr{C} \in \mathfrak{T}(A,B)$ is measure-and-prepare if there 
exists an observation-test $\{a_i\}_{i\in X}$ on A, and a collection 
of normalized states $\{\beta_i\}_{i\in X} \subset \mathfrak{S}_1(B)$ such that 

$$\mathscr{C} = \sum_{i\in X} |\beta_i)_B\,(a_i|_A. \tag{47}$$



C. Distance between transformations 

Here we introduce a norm for transformations that has 
a direct operational interpretation: it quantifies the maximum 
probability of success in the discrimination of two 
channels in a causal theory. Suppose that we are given 
two channels $\mathscr{C}_0, \mathscr{C}_1 \in \mathfrak{T}(A,B)$ with prior probabilities 
$\pi_0, \pi_1$, respectively. In a causal theory, the most general 
way to discriminate is to prepare a bipartite input 
state $\rho \in \mathfrak{S}_1(AC)$, to apply the unknown channel, and to 
perform a binary test that distinguishes between the two 
possible output states $\mathscr{C}_0|\rho)_{AC}$ and $\mathscr{C}_1|\rho)_{AC}$. Optimizing 
over all binary tests and using Eq. (29) we obtain the 
success probability $p_{\rm succ} = \frac{1}{2}\left(1 + \|(\pi_1\mathscr{C}_1 - \pi_0\mathscr{C}_0)\rho\|_{BC}\right)$. 
Moreover, optimizing the input state and the extension 
we find the maximum probability of success 

$$p^{\rm (opt)}_{\rm succ} = \frac{1}{2}\left(1 + \|\pi_1\mathscr{C}_1 - \pi_0\mathscr{C}_0\|_{A,B}\right), \tag{48}$$

where the operational norm for transformations is defined 
by 

$$\|\Delta\|_{A,B} = \sup_{C}\, \sup_{\rho\in\mathfrak{S}_1(AC)} \|\Delta\rho\|_{BC}, \qquad \Delta \in \mathfrak{T}_{\mathbb R}(A,B). \tag{49}$$



In quantum theory our expression for the operational 
norm reduces to the diamond norm in Schrodinger pic- 
ture iQ, or equivalently, to the completely bounded 
(CB) norm in Heisenberg picture jsst . 

In the case of trivial input system A = I, Eq. (49) 
gives back the norm of states introduced in Eq. (30). In 
the case of trivial output system B = I, it provides an 
operational norm for effects, given by 

$$\|\delta\|_{A,I} = \sup_{C}\, \sup_{\rho\in\mathfrak{S}_1(AC)} \|\delta\rho\|_C, \qquad \delta \in \mathfrak{E}_{\mathbb R}(A). \tag{50}$$

In fact, the extension with the ancillary system C is not 
needed in this case: 

Lemma 8 The operational norm of an element $\delta \in \mathfrak{E}_{\mathbb R}(A)$ of the 
vector space spanned by the effects for system 
A is given by the expression 

$$\|\delta\|_{A,I} = \sup_{\rho\in\mathfrak{S}_1(A)} |(\delta|\rho)_A|. \tag{51}$$

Proof. Taking C = I in Eq. (50) yields the lower bound 
$\|\delta\|_{A,I} \ge \sup_{\rho\in\mathfrak{S}_1(A)} \|(\delta|\rho)_A\|_I = \sup_{\rho\in\mathfrak{S}_1(A)} |(\delta|\rho)_A|$, 
where we used the fact that the norm of a real number 
$x \in \mathbb{R} = \mathfrak{S}_{\mathbb R}(I)$ is given by its modulus: $\|x\|_I = |x|$. 
To prove the equality of Eq. (51) we now prove the converse 
bound. By the definition of the operational norm for 
states in Eq. (30), for every $\sigma \in \mathfrak{S}_1(AC)$ we have 

$$\|\delta\sigma\|_C = \sup_{c_1\in\mathfrak{E}(C)} (\delta|_A\,(c_1|_C\,|\sigma)_{AC} - \inf_{c_0\in\mathfrak{E}(C)} (\delta|_A\,(c_0|_C\,|\sigma)_{AC} = \sup_{\{c_0,c_1\}} (\delta|_A\,(c_1 - c_0|_C\,|\sigma)_{AC}, \tag{52}$$

where the optimization in the last equation is over all 
possible binary tests $\{c_0,c_1\}$ for system C. Now, applying 
the observation-test $\{c_0, c_1\}$ to the bipartite state 
$|\sigma)_{AC}$ we obtain a preparation-test $\{\rho_0, \rho_1\}$ for system 
A, defined by $|\rho_i)_A = (c_i|_C\,|\sigma)_{AC}$, $i = 0,1$. Defining 
the probabilities $p_i = (e|\rho_i)_A$ and the normalized states 
$\bar\rho_i = \rho_i/(e|\rho_i)_A$ we then have 

$$(\delta|_A\,(c_1 - c_0|_C\,|\sigma)_{AC} = p_1\,(\delta|\bar\rho_1)_A - p_0\,(\delta|\bar\rho_0)_A \le \max\{(\delta|\bar\rho_1)_A, -(\delta|\bar\rho_0)_A\} \le \sup_{\rho\in\mathfrak{S}_1(A)} |(\delta|\rho)_A|. \tag{53}$$

In quantum theory the norm $\|D\|_{A,I}$ of a hermitian 
operator $D$ on the Hilbert space of system A coincides with 
the operator norm $\|D\|_\infty = \sup_{\rho\ge 0,\ \mathrm{Tr}[\rho]=1} |\mathrm{Tr}[D\rho]| = 
\max_i\{|d_i|\}$, where $\{d_i\}$ are the eigenvalues of $D$. 
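
The quantum statement can be illustrated on a single qubit. The sketch below assumes a real symmetric 2x2 matrix, for which real unit vectors suffice to reach the extreme expectation values, and compares a grid search over pure states with the largest eigenvalue in modulus; the matrix entries, grid size, and function names are arbitrary illustrative choices.

```python
import math

def expectation(d, theta):
    """<psi|D|psi> for the real unit vector psi = (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return d[0][0] * c * c + 2 * d[0][1] * c * s + d[1][1] * s * s

def max_abs_eigenvalue(d):
    """Largest eigenvalue in modulus of a real symmetric 2x2 matrix."""
    mean = (d[0][0] + d[1][1]) / 2
    radius = math.hypot((d[0][0] - d[1][1]) / 2, d[0][1])
    return max(abs(mean - radius), abs(mean + radius))

d = [[0.2, 0.6], [0.6, -0.7]]  # real symmetric; eigenvalues 0.5 and -1.0

# Grid over theta in [0, pi): psi and -psi give the same expectation value.
grid = [math.pi * k / 10000 for k in range(10000)]
sup_over_pure_states = max(abs(expectation(d, t)) for t in grid)
operator_norm = max_abs_eigenvalue(d)
```

The grid search never exceeds the operator norm and approaches it as the grid is refined; mixed states cannot do better, since |Tr[Dρ]| is convex in ρ.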

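We conclude by mentioning a monotonicity property 
of the operational norm of transformations: 

Lemma 9 (Monotonicity of the operational norm 
for transformations) If $\mathscr{C} \in \mathfrak{T}(A,B)$ and $\mathscr{E} \in \mathfrak{T}(C,D)$ 
are two channels, then for every $\Delta \in \mathfrak{T}_{\mathbb R}(B,C)$ one has 

$$\|\mathscr{E}\Delta\mathscr{C}\|_{A,D} \le \|\Delta\|_{B,C}. \tag{54}$$

If $\mathscr{C}$ and $\mathscr{E}$ are reversible one has the equality. 

Proof. Let R be an ancillary system, and $\rho \in 
\mathfrak{S}_1(AR)$ be a normalized state of AR. Then, since 
$|\sigma)_{BR} := \mathscr{C}|\rho)_{AR}$ is a normalized state of BR, we 
have $\|\mathscr{E}\Delta\mathscr{C}\|_{A,D} = \sup_R \sup_{\rho\in\mathfrak{S}_1(AR)} \|\mathscr{E}\Delta\mathscr{C}\rho\|_{DR} \le 
\sup_R \sup_{\sigma\in\mathfrak{S}_1(BR)} \|\mathscr{E}\Delta\sigma\|_{DR}$. Now, using Lemma 1 we 
obtain $\|\mathscr{E}\Delta\sigma\|_{DR} \le \|\Delta\sigma\|_{CR}$. Hence, $\|\mathscr{E}\Delta\mathscr{C}\|_{A,D} \le 
\sup_R \sup_{\sigma\in\mathfrak{S}_1(BR)} \|\Delta\sigma\|_{CR} = \|\Delta\|_{B,C}$. Clearly, if $\mathscr{C}$ and 
$\mathscr{E}$ are reversible, one has the converse bound $\|\Delta\|_{B,C} = 
\|\mathscr{E}^{-1}(\mathscr{E}\Delta\mathscr{C})\mathscr{C}^{-1}\|_{B,C} \le \|\mathscr{E}\Delta\mathscr{C}\|_{A,D}$, thus proving the 
equality. ■ 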



D. Closure and convexity in causal theories 



In Subsect. II J we saw that if a theory is not deterministic, 
then one can construct a circuit that simulates (with 
arbitrary precision) a coin with arbitrary bias $p \in [0, 1]$. 

In causal theories the possibility of conditioning gives 
directly the following: 

Lemma 10 (Approximation of convex combinations) 
If a causal theory is not deterministic, then any 
convex combination of states, effects, and transformations 
can be approximated with arbitrary precision. 

Proof. Let $p \in [0, 1]$ be an arbitrary probability and 
$p_n \in \mathfrak{S}(I)$ be such that $|p - p_n| < 1/n$ (such a probability 
exists because $\mathfrak{S}(I)$ is dense in the interval [0, 1], as stated 
by Lemma 3). Consider two arbitrary tests $\{\mathscr{C}_i\}_{i\in X}$ and 
$\{\mathscr{D}_j\}_{j\in Y}$ from A to B. By randomization, we get the test 
$\{p_n\mathscr{C}_i\}_{i\in X} \cup \{(1 - p_n)\mathscr{D}_j\}_{j\in Y}$. Then, by coarse-graining 
we can obtain the convex combination $p_n\mathscr{C}_i + (1 - p_n)\mathscr{D}_j$. 
The distance from the desired convex combination $p\,\mathscr{C}_i + 
(1 - p)\mathscr{D}_j$ is bounded by $(\|\mathscr{C}_i\|_{A,B} + \|\mathscr{D}_j\|_{A,B})/n \le 2/n$. 
■ 

As a simple consequence we have the following 

Corollary 5 (Closure implies convexity) If a causal 
theory is not deterministic and the set of states of the 
trivial system is closed, then all sets of states, effects, 
and transformations are convex. 

In this paper for simplicity we will always work with 
closed sets of states. Our attention will be devoted to 
non-deterministic causal theories, and, therefore, by the 
previous Corollary 5 closure implies convexity. Note 
that, however, most results hold independently of the 
assumption of convexity, since in the context of non-deterministic 
causal theories any desired combination can 
be approximated with arbitrary precision. 



E. No-restriction hypothesis in causal theories 

In a causal theory the no-restriction hypothesis of Def. 16 
implies that for every system A the cone generated by 
the effects coincides with the dual of the cone generated 
by the states: 

Lemma 11 In a causal theory the no-restriction hypothesis 
of Def. 16 implies the condition $\mathfrak{E}_+(A) = \mathfrak{S}^*_+(A)$ for 
every system A. 

Proof. Suppose that $a$ is an element of $\mathfrak{S}^*_+(A)$ and let 
$\|a\|_{A,I}$ be the operational norm of $a$, as defined in Eq. 
(51). If $\|a\|_{A,I} = 0$, then $a$ is the null effect, which is 
trivially an element of $\mathfrak{E}_+(A)$. If $\|a\|_{A,I} \neq 0$, then define 
the normalized effect $a_0 := a/\|a\|_{A,I}$. Upon defining 
$a_1 := e - a_0$, we now have $(a_1|\rho) \ge 0$ for all $\rho \in \mathfrak{S}_+(A)$, i.e. 
$a_1 \in \mathfrak{S}^*_+(A)$. Moreover, $(a_0|\rho)_A + (a_1|\rho)_A = (e|\rho)_A = 1$ 
for every normalized state $\rho \in \mathfrak{S}_1(A)$. Hence, $\{a_0,a_1\}$ 
is a probability rule. By the no-restriction hypothesis, 
we then have that $\{a_0,a_1\}$ is an observation-test, and, 
therefore, $a_0$ and $a_1$ are effects. This proves that every 
$a \in \mathfrak{S}^*_+(A)$ is proportional to an effect $a_0$, that is, 
$\mathfrak{S}^*_+(A) \subseteq \mathfrak{E}_+(A)$. On the other hand, all effects are 
positive functionals on states, and, therefore, $\mathfrak{S}^*_+(A) \supseteq 
\mathfrak{E}_+(A)$. ■ 

The above condition will be useful when discussing the 
implications of the no-restriction hypothesis in subsections 
IVHDl and pCCl 



IV. LOCAL DISCRIMINABILITY 

Here we discuss the property of local discriminability, 
which expresses the possibility of distinguishing multi- 
partite states using only local devices. 



A. Definition and main properties 

A common assumption in the literature on probabilistic 
theories is what we will call here local discriminability 
(see e.g. Refs. ||^,0,|l|). 

Definition 32 (Local discriminability) A theory enjoys 
local discriminability if whenever two states $\rho, \sigma \in 
\mathfrak{S}(AB)$ are distinct, there are two local effects $a \in \mathfrak{E}(A)$ 
and $b \in \mathfrak{E}(B)$ such that 

$$(a \otimes b|\rho)_{AB} \neq (a \otimes b|\sigma)_{AB}. \tag{55}$$



Note that local discriminability on bipartite states im- 
plies local discriminability on multipartite states, as can 
be seen by simple iteration. 

The meaning of the local discriminability condition is 
that if two bipartite states are different, then there is a 
chance of distinguishing between them by using only local 
devices. Of course, the resulting discrimination may not 
be optimal, but at least it is strictly better than the ran- 
dom guess. Indeed, in the next Lemma we show that in a 
convex theory with local discriminability two parties Al- 
ice and Bob, holding systems A and B, respectively, can 
always find a discrimination protocol that uses only lo- 
cal operations and classical communication (LOCC) and 
outperforms the random guess. 

Lemma 12 (LOCC discrimination) In a convex theory with local discriminability, if two states $\rho_0, \rho_1 \in \mathfrak{S}_1(AB)$ are distinct, then there exists a LOCC discrimination protocol, described by a binary test $\{A_0, A_1\}$, such that the worst-case error probability $p_{wc} := \max\{p(0|1), p(1|0)\}$, with $p(i|j) := (A_i|\rho_j)_{AB}$, is strictly smaller than 1/2.

Proof. Write $\rho := \rho_0$ and $\sigma := \rho_1$. If $\rho \neq \sigma$, then by local discriminability there are always two effects $a, b$ such that $(a \otimes b|\rho)_{AB} > (a \otimes b|\sigma)_{AB}$ (exchanging the roles of $\rho$ and $\sigma$ if necessary). The binary test $\{A, e_{AB} - A\}$ defined by $A := a \otimes b$ can be obtained by performing the local tests $\{a, e_A - a\}$ and $\{b, e_B - b\}$ and taking a coarse-graining. If the theory is convex, exploiting the construction of a previous Lemma (which only requires randomization and coarse-graining) we obtain a binary test $\{A_0, A_1\}$ satisfying $p(0|1) = p(1|0) < 1/2$ and, therefore, $p_{wc} < 1/2$. ■



Local discriminability is an enormous advantage in experiments. For example, it allows one to perform tomography of multipartite states with only local measurements. Indeed, every bipartite effect $(E|_{AB}$ can be written as a linear combination of product effects, and, therefore, every probability $(E|\rho)_{AB}$ can be computed as a linear combination of the probabilities $(a_i \otimes b_j|\rho)_{AB}$ arising from a finite set of product effects:

Lemma 13 (Local tomography) Let $\{\rho_i\}$ and $\{\tilde\rho_j\}$ be two bases for the vector spaces $\mathfrak{S}_{\mathbb R}(A)$ and $\mathfrak{S}_{\mathbb R}(B)$, respectively, and let $\{a_i\}$ and $\{b_j\}$ be two bases for the vector spaces $\mathfrak{E}_{\mathbb R}(A)$ and $\mathfrak{E}_{\mathbb R}(B)$, respectively. A theory enjoys local discriminability if and only if every state $\sigma \in \mathfrak{S}(AB)$ (every effect $E \in \mathfrak{E}(AB)$) can be written as

$$|\sigma)_{AB} = \sum_{i,j} A_{ij}\, |\rho_i)_A |\tilde\rho_j)_B, \qquad (E|_{AB} = \sum_{i,j} B_{ij}\, (a_i|_A (b_j|_B \tag{56}$$

for some suitable real matrix $A_{ij}$ ($B_{ij}$).



Proof. Suppose that local discriminability holds. By definition, the product effects $a \otimes b$ are a separating set for $\mathfrak{S}_{\mathbb R}(AB)$, and, therefore, they are a spanning set for $\mathfrak{E}_{\mathbb R}(AB)$. Since states and effects span vector spaces of equal dimension, this also implies that the product states are a spanning set for $\mathfrak{S}_{\mathbb R}(AB)$. Conversely, if Eq. (56) holds, then the product effects are a spanning set for the vector space $\mathfrak{E}_{\mathbb R}(AB)$. Clearly, if $(a \otimes b|\rho)_{AB} = (a \otimes b|\sigma)_{AB}$ for all product effects, then one has $\rho = \sigma$, and this proves local discriminability. ■
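A dimension count gives a quick sanity check of Lemma 13. The example below relies on standard facts about quantum theory that are not derived in this section and are taken here as assumptions: for a $d$-level system, $\dim \mathfrak{S}_{\mathbb R}$ equals $d^2$ over complex Hilbert spaces (Hermitian matrices) and $d(d+1)/2$ over real Hilbert spaces (real symmetric matrices).

```latex
\begin{align*}
% Complex quantum theory: the product states can span the bipartite state space.
\dim\mathfrak{S}_{\mathbb R}(A)\,\dim\mathfrak{S}_{\mathbb R}(B)
  &= d_A^2\, d_B^2 = (d_A d_B)^2 = \dim\mathfrak{S}_{\mathbb R}(AB), \\
% Real quantum theory with d_A = d_B = 2 (two rebits): one dimension is missing.
\dim\mathfrak{S}_{\mathbb R}(A)\,\dim\mathfrak{S}_{\mathbb R}(B)
  &= 3 \cdot 3 = 9 \;<\; 10 = \frac{4 \cdot 5}{2} = \dim\mathfrak{S}_{\mathbb R}(AB).
\end{align*}
```

In the real case the product states cannot span $\mathfrak{S}_{\mathbb R}(AB)$, so the expansion of Eq. (56) fails for some bipartite states, consistently with the fact that quantum theory on real Hilbert spaces does not enjoy local discriminability.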

This also implies: 

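Theorem 2 (Product of internal states is internal) In a causal theory with local discriminability, if the states $\omega_A$ and $\omega_B$ are internal in $\mathfrak{S}(A)$ and $\mathfrak{S}(B)$, respectively, then the product $\omega_A \otimes \omega_B$ is internal in $\mathfrak{S}(AB)$.

Proof. By definition, one has $\mathrm{Span}(D_{\omega_A \otimes \omega_B}) \supseteq \mathrm{Span}(D_{\omega_A}) \otimes \mathrm{Span}(D_{\omega_B}) = \mathfrak{S}_{\mathbb R}(A) \otimes \mathfrak{S}_{\mathbb R}(B)$. Since local discriminability holds, this is also equal to $\mathfrak{S}_{\mathbb R}(AB)$. ■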

Moreover, local discriminability allows one to distinguish two different transformations $\mathcal{C}, \mathcal{D} \in \mathfrak{T}(A,B)$ without considering their extension with an arbitrary ancilla system C:

Lemma 14 If two transformations $\mathcal{C}, \mathcal{D} \in \mathfrak{T}(A,B)$ are different and local discriminability holds, then there exists a state $\rho \in \mathfrak{S}(A)$ such that

$$\mathcal{C}\,|\rho)_A \neq \mathcal{D}\,|\rho)_A. \tag{57}$$



Proof. By definition, if $\mathcal{C}$ and $\mathcal{D}$ are different there exist a system C and a joint state $\sigma \in \mathfrak{S}(AC)$ such that $\mathcal{C}\,|\sigma)_{AC} \neq \mathcal{D}\,|\sigma)_{AC}$. Now, since local discriminability holds, there are two effects $b, c$ on systems B, C, respectively, such that

$$(b|_B (c|_C\, \mathcal{C}\, |\sigma)_{AC} \neq (b|_B (c|_C\, \mathcal{D}\, |\sigma)_{AC}. \tag{58}$$

Defining $|\rho)_A := (c|_C |\sigma)_{AC}$ we then obtain $(b|_B\, \mathcal{C}\, |\rho)_A \neq (b|_B\, \mathcal{D}\, |\rho)_A$. This implies $\mathcal{C}\,|\rho)_A \neq \mathcal{D}\,|\rho)_A$. ■

B. Causal theories with local discriminability

The results of this paper can be formulated in the sim- 
plest way for causal theories that enjoy local discrim- 
inability. In this case one has the following useful prop- 
erties: 

Lemma 15 Let $|\sigma)_{AB}$ be a state of AB and $|\rho)_A := (e|_B |\sigma)_{AB}$, $|\tilde\rho)_B := (e|_A |\sigma)_{AB}$ be its marginals on systems A, B, respectively. In a causal theory with local discriminability one has

$$\sigma \in \mathrm{Span}(D_{\rho \otimes \tilde\rho}), \tag{59}$$

where $D_{\rho \otimes \tilde\rho}$ is the refinement set of $\rho \otimes \tilde\rho$.

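Proof. Take a basis $\{\rho_i\}_{i=1}^n$ ($\{\tilde\rho_j\}_{j=1}^h$) of states for the span of the refinement set of $\rho$ ($\tilde\rho$), and extend it to a basis $\{\rho_i\}$ ($\{\tilde\rho_j\}$) of $\mathfrak{S}_{\mathbb R}(A)$ (of $\mathfrak{S}_{\mathbb R}(B)$). By local discriminability, we can write $\sigma$ as a linear combination as in Eq. (56) for some coefficients $A_{ij}$. Now, for every effect $(a|_A$ the state $|\tilde\rho_a)_B := (a|_A |\sigma)_{AB}$ is clearly in $D_{\tilde\rho}$. Therefore, we must have $A_{ij} = 0$ for all $j > h$. Likewise, applying an arbitrary effect $(b|_B$ on system B we find that we must have $A_{ij} = 0$ for all $i > n$. This implies

$$|\sigma)_{AB} = \sum_{i=1}^{n} \sum_{j=1}^{h} A_{ij}\, |\rho_i)_A |\tilde\rho_j)_B, \tag{60}$$

that is, $\sigma \in \mathrm{Span}(D_{\rho \otimes \tilde\rho})$. ■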

Since in a non-deterministic causal theory the set of states $\mathfrak{S}(A)$ is convex (Corollary 5 along with the assumption that $\mathfrak{S}(I)$ is closed), we also have the following:

Theorem 3 Let $|\sigma)_{AB}$ be a state of AB and $|\rho)_A := (e|_B |\sigma)_{AB}$, $|\tilde\rho)_B := (e|_A |\sigma)_{AB}$ be its marginals on systems A, B, respectively. In a non-deterministic causal theory with local discriminability there exists a non-zero probability $k > 0$ such that

$$k\sigma \in D_{\rho \otimes \tilde\rho}. \tag{61}$$

The proof of the Theorem is immediate using Lemma 15 along with the following

Lemma 16 In a non-deterministic causal theory, for every couple of states $\sigma, \rho \in \mathfrak{S}_1(A)$ one has

$$\sigma \in \mathrm{Span}(D_\rho) \;\Longrightarrow\; k\sigma \in D_\rho \tag{62}$$

for some non-zero probability $k > 0$.

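Proof. Take a basis $\{\rho_i\}_{i=1}^n$ of $\mathrm{Span}(D_\rho)$ made of states in $D_\rho$. By hypothesis, we can write $\sigma = \sum_i s_i \rho_i$ with suitable real coefficients $s_i$. Moreover, since we are in finite dimensions, there is surely a maximum coefficient $s_{\max} = \max_i s_i$. On the other hand, since $\rho_i$ belongs to $D_\rho$, there is surely a state $\chi_i$ such that $\rho = \rho_i + \chi_i$. This implies

$$\rho = \frac{1}{n} \sum_{i=1}^{n} (\rho_i + \chi_i). \tag{63}$$

Let us define $\tau := \rho - k\sigma$, with $k := 1/(n\, s_{\max})$, and normalize it as $\bar\tau := \tau / (e|\tau)_A$. Using Eq. (63) one has $\tau = \sum_i [(1/n - k s_i)\rho_i + \chi_i/n]$ with non-negative coefficients, so it is easy to verify that $\bar\tau$ is a state, since it is a convex combination of states (recall that in a non-deterministic causal theory the set of states is convex). Moreover, we have $\rho = k\sigma + (1-k)\bar\tau$, which implies the thesis. ■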
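The construction can be checked on a toy case. The example below is only an illustration and rests on two assumptions that are not part of the text: the system is a classical bit, whose states are probability vectors and whose refinement set $D_\rho$ consists of the sub-normalized vectors dominated componentwise by $\rho$, and the constant is chosen as $k = 1/(n\, s_{\max})$, which is one admissible choice rather than the largest possible one.

```latex
\begin{align*}
\rho &= \bigl(\tfrac12, \tfrac12\bigr), \quad
\rho_1 = \bigl(\tfrac12, 0\bigr), \quad \rho_2 = \bigl(0, \tfrac12\bigr), \quad
\chi_1 = \rho_2, \quad \chi_2 = \rho_1, \quad n = 2, \\
\sigma &= \bigl(\tfrac34, \tfrac14\bigr) = \tfrac32\,\rho_1 + \tfrac12\,\rho_2
  \quad\Longrightarrow\quad s_{\max} = \tfrac32, \qquad k = \frac{1}{n\, s_{\max}} = \tfrac13, \\
\tau &= \rho - k\sigma = \bigl(\tfrac14, \tfrac{5}{12}\bigr), \qquad
(e|\tau) = \tfrac23 = 1 - k, \qquad
\bar\tau = \bigl(\tfrac38, \tfrac58\bigr), \\
k\sigma + (1-k)\bar\tau &= \bigl(\tfrac14, \tfrac1{12}\bigr) + \bigl(\tfrac14, \tfrac5{12}\bigr)
  = \bigl(\tfrac12, \tfrac12\bigr) = \rho .
\end{align*}
```

Since $\bar\tau$ has non-negative entries summing to one, it is a state, and $k\sigma = (\tfrac14, \tfrac1{12})$ is indeed an element of $D_\rho$.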

Remark. In the previous Lemma 16 we used the fact that in a non-deterministic causal theory the set of states is convex (Corollary 5 along with the assumption that $\mathfrak{S}(I)$ is closed). In fact, we can weaken this assumption in the proofs of Theorem 3 and Lemma 16. Indeed, in any non-deterministic causal theory we can approximate the convex combinations needed for the proof of Lemma 16 with arbitrary precision (Lemma 10), thus proving Eqs. (61) and (62) with a non-zero probability $k > 0$ that arises from a test allowed by the theory.

Theorems 2 and 3 state two very natural properties. Even when discussing the extension of our results beyond the framework of local discriminability we will assume these properties to hold.

Finally, causal theories with local discriminability en- 
joy a nice characterization of states that are invariant 
under the group of reversible transformations: 

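Theorem 4 In a causal theory with local discriminability, if systems A and B have unique invariant states $|\chi)_A \in \mathfrak{S}_1(A)$ and $|\chi)_B \in \mathfrak{S}_1(B)$, respectively, then $|\chi)_A |\chi)_B \in \mathfrak{S}_1(AB)$ is the unique locally invariant state of system AB.

Proof. Suppose that $|\sigma)_{AB}$ is a locally invariant state, namely

$$(\mathcal{U}_A \otimes \mathcal{V}_B)\, |\sigma)_{AB} = |\sigma)_{AB} \tag{64}$$

for all $\mathcal{U} \in G_A$ and $\mathcal{V} \in G_B$. If we apply two arbitrary effects $(a|_A$ and $(b|_B$ we then get

$$\mathcal{V}\, |\tilde\rho_a)_B = |\tilde\rho_a)_B, \qquad \mathcal{U}\, |\rho_b)_A = |\rho_b)_A, \tag{65}$$

having defined $|\tilde\rho_a)_B := (a|_A |\sigma)_{AB}$ and $|\rho_b)_A := (b|_B |\sigma)_{AB}$. Now, $\tilde\rho_a$ and $\rho_b$ are invariant (unnormalized) states. Since $\chi_A$ ($\chi_B$) is the unique state of A (B) that is invariant and normalized, one must have

$$|\chi)_A = \frac{|\rho_b)_A}{(e|\rho_b)_A} = \frac{|\rho_b)_A}{(e \otimes b|\sigma)_{AB}} = \frac{|\rho_b)_A}{(b|\tilde\rho)_B}, \qquad
|\chi)_B = \frac{|\tilde\rho_a)_B}{(e|\tilde\rho_a)_B} = \frac{|\tilde\rho_a)_B}{(a \otimes e|\sigma)_{AB}} = \frac{|\tilde\rho_a)_B}{(a|\rho)_A}, \tag{66}$$

$|\rho)_A$, $|\tilde\rho)_B$ being the marginal states on systems A, B, respectively. Using the above relations in the identities $(a|_A (b|_B |\sigma)_{AB} = (a|\rho_b)_A = (b|\tilde\rho_a)_B$, we then obtain

$$(a|_A (b|_B |\sigma)_{AB} = (a|\chi)_A\, (b|\tilde\rho)_B = (a|\rho)_A\, (b|\chi)_B \tag{67}$$

for every $a, b$. By local discriminability, this implies $|\sigma)_{AB} = |\chi)_A |\tilde\rho)_B = |\rho)_A |\chi)_B$, and, therefore, $|\sigma)_{AB} = |\chi)_A |\chi)_B$. ■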



V. BEYOND LOCAL DISCRIMINABILITY AND 
CONVEXITY 

Although the results of this paper take their simplest 
form for causal theories with local discriminability, most 
of them are valid in causal theories under weaker re- 
quirements. For example, they hold for quantum theory 
on real Hilbert spaces, which is a well known example 
of theory without local discriminability. Moreover, al- 
though convexity is very well motivated in the context 
of causal theories, most results of this paper hold even 
in non-convex theories. In this Section we briefly discuss 
these generalizations. 



A. Relaxing local discriminability 

A weaker requirement than local discriminability is lo- 
cal discriminability on pure states: 

Definition 33 (Local discriminability on pure states) A theory enjoys local discriminability on pure states if whenever two states $\Psi, \sigma \in \mathfrak{S}(AB)$ are different, and one of the two states (say $\Psi$) is pure, there are two effects $a \in \mathfrak{E}(A)$ and $b \in \mathfrak{E}(B)$ such that

$$(a \otimes b|\Psi)_{AB} \neq (a \otimes b|\sigma)_{AB}. \tag{68}$$



An example of theory with this property is quantum 
theory on real Hilbert spaces: 

Lemma 17 Quantum theory on real Hilbert spaces en- 
joys local discriminability on pure states. 

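Proof. Let $\rho = \sum_i p_i |\Phi_i\rangle\langle\Phi_i|$ be a density matrix on the real Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$ with $\mathcal{H}_A = \mathbb{R}^m$ and $\mathcal{H}_B = \mathbb{R}^n$, and let $|\Psi\rangle \in \mathbb{R}^m \otimes \mathbb{R}^n$ be a unit vector. Suppose that $\mathrm{Tr}[(\rho - |\Psi\rangle\langle\Psi|)(a \otimes b)] = 0$ for every couple of real matrices $a$ and $b$. Taking $a = |v\rangle\langle v|$ for some $v \in \mathbb{R}^m$ we then obtain $\langle v|_A |\Phi_i\rangle_{AB} = k_{i,v} \langle v|_A |\Psi\rangle_{AB}$ for some constant $k_{i,v}$. Likewise, taking $b = |w\rangle\langle w|$ for some $w \in \mathbb{R}^n$ we obtain $\langle w|_B |\Phi_i\rangle_{AB} = l_{i,w} \langle w|_B |\Psi\rangle_{AB}$ for some constant $l_{i,w}$. Putting the two things together we have

$$\langle v|_A \langle w|_B |\Phi_i\rangle_{AB} = k_{i,v}\, \langle v|_A \langle w|_B |\Psi\rangle_{AB} = l_{i,w}\, \langle v|_A \langle w|_B |\Psi\rangle_{AB}, \tag{69}$$

hence $k_{i,v} = l_{i,w} := c_i$. Finally, $\langle v|_A \langle w|_B |\Phi_i\rangle_{AB} = c_i \langle v|_A \langle w|_B |\Psi\rangle_{AB}$ for every $v, w$ implies $|\Phi_i\rangle = c_i |\Psi\rangle$, and, therefore, $\rho = |\Psi\rangle\langle\Psi|$. ■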

When generalizing our results to theories without local discriminability we will always assume local discriminability on pure states along with the theses of Theorems 2, 3 and 4. Again, all these requirements are met by quantum theory on real Hilbert spaces.

An elementary property of causal theories with local 
discriminability on pure states is that the product of two 
pure states is pure, as stated in the following Lemma. 

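Lemma 18 (Product of pure states is pure) In a causal theory with local discriminability on pure states, if the states $|\varphi)_A \in \mathfrak{S}_1(A)$ and $|\psi)_B \in \mathfrak{S}_1(B)$ are pure, then their product $|\varphi)_A |\psi)_B \in \mathfrak{S}_1(AB)$ is pure.

Proof. Suppose that the product can be written as a convex combination $|\varphi)_A |\psi)_B = \sum_{i \in X} p_i |\Psi_i)_{AB}$, with $|\Psi_i)_{AB} \in \mathfrak{S}_1(AB)$. We now show that $|\Psi_i)_{AB} = |\varphi)_A |\psi)_B$ for every $i \in X$. Let $(b|_B$ be an arbitrary effect for system B. We then have

$$\sum_{i \in X} p_i\, (b|_B |\Psi_i)_{AB} = (b|\psi)_B\, |\varphi)_A. \tag{70}$$

Since $|\varphi)_A$ is pure, this implies

$$(b|_B |\Psi_i)_{AB} = \lambda_{b,i}\, |\varphi)_A \tag{71}$$

for some coefficient $\lambda_{b,i} \ge 0$. Clearly, for $(b|_B = (e|_B$ one has $\lambda_{e,i} = 1$. Similarly, if $(a|_A$ is an arbitrary effect for system A, we obtain

$$(a|_A |\Psi_i)_{AB} = \mu_{a,i}\, |\psi)_B \tag{72}$$

for some coefficient $\mu_{a,i} \ge 0$ satisfying $\mu_{e,i} = 1$. Combining the above facts, we obtain

$$(a|_A (b|_B |\Psi_i)_{AB} = \lambda_{b,i}\, (a|\varphi)_A = \mu_{a,i}\, (b|\psi)_B. \tag{73}$$

Setting $a = e$ gives $\lambda_{b,i} = (b|\psi)_B$. Finally, this implies

$$(a|_A (b|_B |\Psi_i)_{AB} = \lambda_{b,i}\, (a|\varphi)_A = (a|\varphi)_A\, (b|\psi)_B \tag{74}$$

and, by local discriminability on pure states, $|\Psi_i)_{AB} = |\varphi)_A |\psi)_B$. ■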

Clearly, iterating the above reasoning one can also show that the product of N pure states $|\varphi_1)_{A_1} |\varphi_2)_{A_2} \cdots |\varphi_N)_{A_N}$ is pure.

B. Relaxing convexity

If one wants to relax convexity, it is clear from the proof of Lemma 10 and Corollary 5 that one must have at least one of the following features: i) the theory is deterministic, i.e. all events have either zero or unit probability, ii) some randomizations or some coarse-grainings are forbidden, or iii) the set of probabilities $\mathfrak{S}(I)$ of the theory is not closed. For the purposes of this paper, deterministic theories are not quite interesting, and theories with non-closed sets of transformations are just technically cumbersome, although most of the conclusions of this paper remain unchanged. Therefore, in relaxing convexity we will only consider the case in which some conditioned tests or some coarse-grained tests are forbidden. Of course, if one wants to drop a basic operational requirement like the possibility of conditioning, one has to take care that some minimal properties hold. For example, the existence of internal states, the fact that every test has an ultimate refinement, and the validity of the theses of Theorems 2 and 3 have to be explicitly postulated. One would also need to assume that it is not forbidden i) to attach a distinguishable state $|\varphi_i)_B$ to every state in a preparation-test $\{|\rho_i)_A\}_{i \in X}$, thus getting the new test $\{|\rho_i)_A |\varphi_i)_B\}_{i \in X}$, and ii) to perform a discriminating test $\{a_i\}_{i \in X}$ for the perfectly discriminable states $\{\rho_i\}_{i \in X}$, and to re-prepare state $\rho_i$ when the outcome is $i$, thus getting the "measure-and-prepare" test $\{|\rho_i)_A (a_i|_A\}_{i \in X}$.

Finally, we will show that the existence of twirling tests is necessary for deterministic teleportation. If one wants to consider non-convex theories with deterministic teleportation one has also to require the existence of a twirling-test and the thesis of Theorem 4.



VI. SUMMARY OF THE FRAMEWORK

This short Section concludes the presentation of the general framework used in this paper. The standing assumptions of the paper are summarized by the following table:

In this paper, if not otherwise stated, we will consider operational-probabilistic theories satisfying the following requirements:

1. the theory is causal (every state is proportional to a normalized one)

2. local discriminability holds

3. the set of all tests is closed under coarse-graining and conditioning

4. for every system, the set of states is finite-dimensional and closed in the operational norm

5. there exist perfectly discriminable states

6. the theory is not deterministic

Note that the existence of perfectly discriminable states, needed to describe perfect classical communication, is guaranteed in the usual convex framework, which contains the no-restriction hypothesis. We recall that we do not make this assumption here.

In most proofs the background requirement 2. can always be weakened to:

2'. local discriminability on pure states and the theses of Theorems 2 and 3 hold

If a particular result requires local discriminability or convexity this will be mentioned explicitly in its statement.



VII. THEORIES WITH PURIFICATION

Here we introduce the purification postulate "every mixed state has a purification, unique up to reversible transformations on the purifying system", and we explore its consequences within the general framework outlined in the previous Sections.



A. The purification postulate

Definition 34 (Purification) A pure state $\Psi \in \mathfrak{S}_1(AB)$ is a purification of $\rho \in \mathfrak{S}_1(A)$ if

$$|\rho)_A = (e|_B |\Psi)_{AB}. \tag{75}$$



Definition 35 (Purifying system) If system AB contains a purification of $\rho \in \mathfrak{S}_1(A)$, we call system B a purifying system for $\rho$.

Definition 36 (Complementary state) Let $\Psi \in \mathfrak{S}_1(AB)$ be a purification of $\rho \in \mathfrak{S}_1(A)$. The complementary state of $\rho$ is the state $\tilde\rho \in \mathfrak{S}_1(B)$ defined by

$$|\tilde\rho)_B := (e|_A |\Psi)_{AB}. \tag{76}$$



An elementary property of purification is the following 

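Lemma 19 If $\varphi \in \mathfrak{S}_1(A)$ is pure and $\Psi \in \mathfrak{S}_1(AB)$ is a purification of $\varphi$, then $\Psi$ must be of the form $\Psi = \varphi \otimes \psi$, with $\psi \in \mathfrak{S}_1(B)$ pure.

Proof. Take an observation-test $\{b_i\}_{i \in X}$ on B. Since $\sum_{i \in X} b_i = e_B$ we have

$$\sum_{i \in X} (b_i|_B |\Psi)_{AB} = (e|_B |\Psi)_{AB} = |\varphi)_A, \tag{77}$$

namely, the states $\{\rho_i\}_{i \in X}$ defined by $|\rho_i)_A := (b_i|_B |\Psi)_{AB}$ form a refinement of $\varphi$. Since $\varphi$ is pure, we necessarily have $\rho_i = p_i \varphi$ for some probabilities $\{p_i\}$. Precisely, we have $p_i = (e|\rho_i)_A = (e \otimes b_i|\Psi)_{AB} = (b_i|\psi)_B$, where $\psi$ is the complementary state of $\varphi$. Therefore, we have

$$(b_i|_B |\Psi)_{AB} = (b_i|\psi)_B\, |\varphi)_A \qquad \forall i \in X. \tag{78}$$

The above equation implies that $\Psi$ cannot be distinguished from $\varphi \otimes \psi$ by any local test. Since $\Psi$ is pure, this implies $\Psi = \varphi \otimes \psi$. Clearly, $\psi$ has to be pure, otherwise we would have a non-trivial refinement of the pure state $\Psi$. ■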

It is important to stress that purification is not a physical process: there is no physical transformation that is able to turn any arbitrary mixed state $\rho$ into some purification $\Psi$ of it. In quantum mechanics, this has been noted by Kleinman et al. Along the same lines, it is easy to prove the following general Theorem:

Theorem 5 (No-purification of collinear states) Let $\rho_i$, $i = 1, 2, 3$, be three distinct collinear states of system A, i.e. $\rho_1 \neq \rho_3$ and $\rho_2 = p\rho_1 + (1-p)\rho_3$ for some $0 < p < 1$. Suppose that $|\Psi_i)_{AB}$, $i = 1, 2, 3$, is a purification of $|\rho_i)_A$. Then for every finite number of copies N there is no physical transformation $\mathcal{C} \in \mathfrak{T}(A^{\otimes N}, AB)$ such that $\mathcal{C}\, |\rho_i)^{\otimes N} = |\Psi_i)_{AB}$ for every $i = 1, 2, 3$.

Proof. The proof is by contradiction. Suppose that such a transformation exists for some finite N. Then, expanding the product $\rho_2^{\otimes N} = [p\rho_1 + (1-p)\rho_3]^{\otimes N}$ and applying the transformation we obtain

$$|\Psi_2)_{AB} = \mathcal{C}\, |\rho_2)^{\otimes N} = p^N\, \mathcal{C}\, |\rho_1)^{\otimes N} + (1-p)^N\, \mathcal{C}\, |\rho_3)^{\otimes N} + |\rho_{\rm rest})_{AB} = p^N\, |\Psi_1)_{AB} + (1-p)^N\, |\Psi_3)_{AB} + |\rho_{\rm rest})_{AB}, \tag{79}$$

where $\rho_{\rm rest}$ is a suitable non-normalized state. This is clearly absurd, since we obtained a non-trivial convex decomposition of the pure state $\Psi_2$. ■
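For concreteness, the expansion used in the proof can be written out for $N = 2$. This is only an illustrative special case; the normalization of $\rho_{\rm rest}$ follows from the assumption, made in the statement of the Theorem, that $\mathcal{C}$ maps each $\rho_i^{\otimes N}$ to a normalized state.

```latex
\begin{align*}
\rho_2^{\otimes 2} &= p^2\, \rho_1 \otimes \rho_1 + (1-p)^2\, \rho_3 \otimes \rho_3
   + p(1-p)\,\bigl( \rho_1 \otimes \rho_3 + \rho_3 \otimes \rho_1 \bigr), \\
|\rho_{\rm rest})_{AB} &= p(1-p)\; \mathcal{C}\, \bigl[\, |\rho_1)|\rho_3) + |\rho_3)|\rho_1) \,\bigr], \\
(e|\rho_{\rm rest})_{AB} &= 1 - p^2 - (1-p)^2 = 2p(1-p) > 0 .
\end{align*}
```

The pure state $\Psi_2$ would then be a mixture of $\Psi_1$, $\Psi_3$ and the normalized version of $\rho_{\rm rest}$ with strictly positive weights $p^2$, $(1-p)^2$ and $2p(1-p)$, which is the contradiction used in the proof.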

If $\Psi$ is a purification of $\rho$ and $\mathcal{U}_B$ is a reversible transformation on the purifying system, then also $|\Psi')_{AB} := \mathcal{U}_B |\Psi)_{AB}$ is a purification of $\rho$. Indeed, $|\Psi')_{AB}$ must be pure, otherwise by inverting $\mathcal{U}_B$ on $\mathcal{U}_B |\Psi)_{AB}$ by linearity one would find that $|\Psi)_{AB}$ is mixed. In the following Postulate we impose that all purifications are of this form:

Postulate 1 (Purification) Every state has a purification, unique up to reversible transformations on the purifying system: if $\Psi, \Psi' \in \mathfrak{S}_1(AB)$ are two purifications of the same state, then they are connected by a reversible transformation $\mathcal{U} \in \mathfrak{T}(B)$, namely

$$|\Psi')_{AB} = (\mathcal{I}_A \otimes \mathcal{U}_B)\, |\Psi)_{AB}. \tag{80}$$



Remark (Uniqueness of the complementary state) Note that uniqueness of the purification assumed in the purification postulate is equivalent to the uniqueness (up to reversible transformations) of the complementary state defined in Def. 36.

We now show some simple consequences of the purifi- 
cation postulate. First, it implies that all pure states of 
a system are connected by reversible transformations: 

Lemma 20 (Transitivity of the group of reversible transformations on the set of pure states) For any couple of pure states $\varphi, \varphi' \in \mathfrak{S}_1(A)$ there is a reversible transformation $\mathcal{U} \in \mathfrak{T}(A)$ such that $\varphi' = \mathcal{U}\varphi$.

Proof. Every system is a purifying system for the trivial system. Then just apply Eq. (80) with A = I. ■

An obvious consequence of the purification postulate 
is that in a theory with purification there are entangled 
states, according to the usual definition: 

Definition 37 (Separable states/entangled states) A bipartite state $\sigma \in \mathfrak{S}_1(AB)$ is separable if it can be written as a convex combination of product states, that is, as $|\sigma)_{AB} = \sum_i p_i\, |\varphi_i)_A |\psi_i)_B$ with $p_i \ge 0$, $\sum_i p_i = 1$. A bipartite state is entangled if it is not separable.

As already anticipated, one has the following (trivial) Corollary:

Corollary 6 (Existence of entangled states) If $\Psi_\rho \in \mathfrak{S}_1(AB)$ is a purification of $\rho \in \mathfrak{S}_1(A)$ and $\rho$ is mixed, then $\Psi_\rho$ is entangled.

Proof. By contradiction, suppose that $\Psi_\rho$ is separable. Because it is pure, it must be of the form $|\Psi_\rho)_{AB} = |\varphi)_A |\psi)_B$ with $|\varphi)_A$ and $|\psi)_B$ pure. Then the marginal $|\rho)_A = (e|_B |\Psi_\rho)_{AB} = |\varphi)_A$ is pure, in contradiction with the hypothesis. ■

Remark (Purification and classical theories). Clearly, Corollary 6 shows that the purification postulate rules out classical probability theory. In fact, there is only one possibility for a causal theory to satisfy the purification postulate without having entangled states: the theory must not contain mixed states. This necessarily implies that the theory is deterministic, that is, that the probabilities of outcomes in any test are either 0 or 1 (if the theory were not deterministic one could construct mixed states by randomization). In particular, this also implies that in such a theory the pure states of an arbitrary system A are perfectly distinguishable. In conclusion, the only causal theories that satisfy the purification postulate and have no entanglement are classical deterministic theories.

Another elementary consequence of the purification 
postulate is that "purity implies independence from the 
rest of the world" : 

Corollary 7 (Purity implies independence) If $\psi \in \mathfrak{S}_1(A)$ is pure and $\rho \in \mathfrak{S}_1(AB)$ is an extension of $\psi$, namely $|\psi)_A = (e|_B |\rho)_{AB}$, then $\rho = \psi \otimes \sigma$, for some state $\sigma \in \mathfrak{S}_1(B)$.

Proof. Let $\Psi \in \mathfrak{S}_1(ABC)$ be a purification of $\rho$. Since $\Psi$ is also a purification of $\psi$, by Lemma 19 we have $|\Psi)_{ABC} = |\psi)_A |\eta)_{BC}$, for some pure state $\eta \in \mathfrak{S}_1(BC)$. But since $\Psi$ is a purification of $\rho$ we have $|\rho)_{AB} = (e|_C |\Psi)_{ABC} = |\psi)_A |\sigma)_B$, with $|\sigma)_B := (e|_C |\eta)_{BC}$. ■

We conclude this subsection with an important Lemma 
that extends the uniqueness of purification to the case of 
purifications with different purifying systems: 

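Lemma 21 (Uniqueness of the purification up to channels on the purifying systems) Let $\Psi \in \mathfrak{S}_1(AB)$ and $\Psi' \in \mathfrak{S}_1(AC)$ be two purifications of $\rho \in \mathfrak{S}_1(A)$. Then there exists a channel $\mathcal{C} \in \mathfrak{T}(B, C)$ such that

$$|\Psi')_{AC} = (\mathcal{I}_A \otimes \mathcal{C})\, |\Psi)_{AB}. \tag{81}$$

Moreover, channel $\mathcal{C}$ has the form

$$\mathcal{C} = (e|_B\, \mathcal{U}\, |\varphi_0)_C \tag{82}$$

for some pure state $\varphi_0 \in \mathfrak{S}_1(C)$ and some reversible channel $\mathcal{U} \in G_{BC}$.

Proof. Let $|\eta)_B$ and $|\varphi_0)_C$ be arbitrary pure states of B and C, respectively. Then $|\Psi')_{AC} |\eta)_B$ and $|\Psi)_{AB} |\varphi_0)_C$ are two purifications of $\rho$ with the same purifying system BC. Due to Eq. (80), we have

$$|\Psi')_{AC} |\eta)_B = (\mathcal{I}_A \otimes \mathcal{U}_{BC})\, |\Psi)_{AB} |\varphi_0)_C. \tag{83}$$

Applying the deterministic effect $e$ on system B we obtain Eq. (81), with $\mathcal{C} := (e|_B\, \mathcal{U}\, |\varphi_0)_C$. ■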



B. Purification of preparation-tests 

We now show that the purification of normalized states 
implies the purification of preparation-tests. 

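Theorem 6 (Purification of preparation-tests) Let $\{\rho_i\}_{i \in X}$ be a preparation-test for system A, and let $\Psi \in \mathfrak{S}_1(AB)$ be a purification of the coarse-grained state $\rho := \sum_{i \in X} \rho_i$. Then there exists an observation-test $\{b_i\}_{i \in X}$ on system B such that

$$|\rho_i)_A = (b_i|_B |\Psi)_{AB} \tag{84}$$

for any outcome $i \in X$. By suitably choosing the purifying system B, the observation-test $\{b_i\}_{i \in X}$ can be taken to be discriminating.

Proof. Take a set of $|X|$ perfectly distinguishable states $\{\varphi_i\}_{i \in X} \subset \mathfrak{S}_1(C)$ for some system C. By definition of perfect distinguishability, there exists a discriminating test $\{c_i\}_{i \in X}$ such that

$$(c_i|\varphi_j)_C = \delta_{ij} \tag{85}$$

for all $i, j \in X$. Now consider the state

$$\sigma := \sum_{j \in X} (\rho_j \otimes \varphi_j) \in \mathfrak{S}_1(AC), \tag{86}$$

which is clearly an extension of $\rho$, namely $|\rho)_A = (e|_C |\sigma)_{AC}$. Let $\Psi_\sigma \in \mathfrak{S}_1(ACD)$ be a purification of $\sigma$. By definition, $\Psi_\sigma$ is also a purification of $\rho$. Using Eq. (85) we obtain for every outcome $i \in X$

$$|\rho_i)_A = (c_i|_C |\sigma)_{AC} = (c_i|_C (e|_D |\Psi_\sigma)_{ACD} = (b_i|_{CD} |\Psi_\sigma)_{ACD}, \tag{87}$$

having defined the discriminating test $(b_i|_{CD} := (c_i|_C (e|_D$. This proves that there exists a purification of $\rho$ with purifying system B := CD, and a discriminating test $\{b_i\}_{i \in X}$ on B such that the thesis holds.

Finally, if $\Psi' \in \mathfrak{S}_1(AB')$ is any other purification of $\rho$, using Lemma 21 we have

$$|\rho_i)_A = (b_i|_B |\Psi)_{AB} = (b_i|_B\, \mathcal{C}\, |\Psi')_{AB'} \tag{89}$$
$$\phantom{|\rho_i)_A} = (b'_i|_{B'} |\Psi')_{AB'}, \tag{90}$$

where $\mathcal{C} \in \mathfrak{T}(B', B)$ is the channel of Lemma 21 and $\{b'_i\}_{i \in X}$ is the observation-test on B' defined by $(b'_i|_{B'} := (b_i|_B\, \mathcal{C}$. ■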

The property stated by Theorem 6 is sometimes called steering in quantum theory, with a terminology that dates back to Schrödinger (see also the very recent discussion in the general probabilistic framework): one says that a bipartite state $|\sigma)_{AB}$ steers its marginal $|\rho)_A = (e|_B |\sigma)_{AB}$ on system A, if every convex decomposition $|\rho)_A = \sum_{i \in X} p_i |\rho_i)_A$ is induced by a suitable observation-test on system B. Using the notion of steering, we may state the following:

Corollary 8 (Pure bipartite states are steering for their marginals) In a theory with purification any pure state $|\Psi) \in \mathfrak{S}_1(AB)$ steers its marginal states $|\rho)_A = (e|_B |\Psi)_{AB}$ and $|\tilde\rho)_B = (e|_A |\Psi)_{AB}$.
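A familiar quantum instance may help fix ideas. The sketch below assumes standard finite-dimensional quantum theory, which is not the framework of this paper: states are density matrices, an effect $b$ on B acts on a bipartite state as $\mathrm{Tr}_B[(I \otimes b)\,\cdot\,]$, and $|\pm\rangle := (|0\rangle \pm |1\rangle)/\sqrt{2}$.

```latex
\begin{align*}
|\Psi\rangle &= \tfrac{1}{\sqrt 2}\bigl( |0\rangle|0\rangle + |1\rangle|1\rangle \bigr), \qquad
\rho = \mathrm{Tr}_B\, |\Psi\rangle\langle\Psi| = \tfrac12 I, \\
% Measuring B in the computational basis steers one decomposition of rho:
\mathrm{Tr}_B\bigl[ (I \otimes |j\rangle\langle j|)\, |\Psi\rangle\langle\Psi| \bigr]
  &= \tfrac12\, |j\rangle\langle j|, \qquad j = 0, 1, \\
% Measuring B in the rotated basis steers a different decomposition of the same rho:
\mathrm{Tr}_B\bigl[ (I \otimes |\pm\rangle\langle \pm|)\, |\Psi\rangle\langle\Psi| \bigr]
  &= \tfrac12\, |\pm\rangle\langle \pm| .
\end{align*}
```

Both ensembles sum to $\tfrac12 I$, so the same pure state induces two different refinements of its marginal, depending only on the observation-test chosen on B.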

We now present a few other corollaries of the purification of preparation-tests stated by Theorem 6.

Corollary 9 Let $\Psi \in \mathfrak{S}_1(AB)$ be a purification of $\rho$. Then, a state $\sigma$ is in the refinement set $D_\rho$ if and only if there is an effect $b_\sigma \in \mathfrak{E}(B)$ such that

$$|\sigma)_A = (b_\sigma|_B |\Psi)_{AB}. \tag{91}$$

Proof. The "if" part is trivial. Conversely, if $\sigma$ is in $D_\rho$, by definition there exists a preparation-test $\{\rho_i\}_{i \in X}$ and an outcome $i_0$ such that $\rho_{i_0} = \sigma$. Using Theorem 6 and taking the effect $b_\sigma := b_{i_0}$ one proves the thesis. ■

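Corollary 10 (Bound on dimensions) Let $\Psi \in \mathfrak{S}_1(AB)$ be a purification of $\rho \in \mathfrak{S}_1(A)$. Then, one has the bound

$$\dim \mathfrak{S}_{\mathbb R}(B) \ge \dim \mathrm{Span}(D_\rho). \tag{92}$$

In particular, if $\rho$ is an internal state, one has

$$\dim \mathfrak{S}_{\mathbb R}(B) \ge \dim \mathfrak{S}_{\mathbb R}(A). \tag{93}$$

Proof. Consider the map $\omega : \mathfrak{E}_{\mathbb R}(B) \to \mathfrak{S}_{\mathbb R}(A)$ defined by $(b|_B \mapsto |\omega_b)_A := (b|_B |\Psi)_{AB}$. By the previous corollary, the range of $\omega$ contains $D_\rho$. Since $\omega$ is linear, this implies $\dim \mathfrak{E}_{\mathbb R}(B) \ge \dim \mathrm{Span}(D_\rho)$. On the other hand, since states and effects span dual vector spaces, one has $\dim \mathfrak{E}_{\mathbb R}(B) = \dim \mathfrak{S}_{\mathbb R}(B)$, thus proving Eq. (92). ■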

Theorem 6 implies the existence of pure bipartite states exhibiting perfect correlations in the statistics of independent observations:

Corollary 11 (Pure states with perfect correlations) Let $\rho = \sum_{i \in X} p_i \varphi_i$ be a mixture of perfectly distinguishable states $\{\varphi_i\} \subset \mathfrak{S}_1(A)$, and let $\Psi \in \mathfrak{S}_1(AB)$ be a purification of $\rho$. Then there exist two observation-tests $\{a_i\}_{i \in X}$ and $\{b_j\}_{j \in X}$ on systems A and B, respectively, such that

$$(a_i|_A (b_j|_B |\Psi)_{AB} = p_j\, \delta_{ij}. \tag{94}$$

Proof. Consider the preparation-test $\{\rho_i\}_{i \in X}$ with $\rho_i = p_i \varphi_i$. Since its coarse-grained state is $\rho$, by Theorem 6 there exists an observation-test $\{b_i\}$ such that $|\rho_i)_A = (b_i|_B |\Psi)_{AB}$. On the other hand, the states $\{\varphi_i\}$ are perfectly distinguishable with a test $\{a_i\}_{i \in X}$. Hence, we have $(a_i|_A (b_j|_B |\Psi)_{AB} = (a_i|\rho_j)_A = p_j\, \delta_{ij}$. ■
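In quantum theory the statement reduces to the familiar Schmidt-form correlations. The following check assumes standard quantum theory on $\mathbb{C}^2 \otimes \mathbb{C}^2$, with the computational basis playing the role of both observation-tests; it is an illustration, not part of the general argument.

```latex
\begin{align*}
\rho &= p_0\, |0\rangle\langle 0| + p_1\, |1\rangle\langle 1|, \qquad
|\Psi\rangle = \sqrt{p_0}\, |0\rangle|0\rangle + \sqrt{p_1}\, |1\rangle|1\rangle, \\
a_i &= |i\rangle\langle i|, \qquad b_j = |j\rangle\langle j|, \\
\langle \Psi |\, (a_i \otimes b_j)\, | \Psi \rangle
  &= \bigl| \langle i| \langle j | \Psi \rangle \bigr|^2
   = \bigl| \sqrt{p_j}\, \delta_{ij} \bigr|^2 = p_j\, \delta_{ij} .
\end{align*}
```

The outcomes on A and B coincide with certainty, and outcome $j$ occurs with probability $p_j$, as in Eq. (94).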

This directly implies the following property

Corollary 12 Let $\rho = \sum_{i \in X} p_i \varphi_i \in \mathfrak{S}_1(A)$ be a mixture of perfectly distinguishable states, $\Psi \in \mathfrak{S}_1(AB)$ be a purification of $\rho$, and $|\tilde\rho)_B = (e|_A |\Psi)_{AB}$ be the complementary state of $\rho$. Then, one has

$$|\tilde\rho)_B = \sum_{i \in X} p_i\, |\tilde\varphi_i)_B, \tag{95}$$

where $\{\tilde\varphi_i\}_{i \in X}$ are perfectly distinguishable states of B.

We conclude this subsection with a crucial consequence of the purification of preparation-tests stated by Theorem 6, namely that if two transformations coincide on a purification of $\rho$, they also coincide upon input of $\rho$, according to the following definition:

Definition 38 (Equality upon input of $\rho$) Two transformations $\mathcal{A}, \mathcal{A}' \in \mathfrak{T}(A, B)$ are equal upon input of $\rho$, denoted by $\mathcal{A} =_\rho \mathcal{A}'$, if one has

$$\mathcal{A}\, |\sigma)_A = \mathcal{A}'\, |\sigma)_A \qquad \forall \sigma \in D_\rho. \tag{96}$$

In quantum theory two quantum operations $\mathcal{A}, \mathcal{A}'$ are equal upon input of $\rho$ if and only if one has $\mathcal{A}(\sigma) = \mathcal{A}'(\sigma)$ for every density matrix $\sigma$ whose support is contained in the support of $\rho$.

We then have the following: 

Theorem 7 (Equality upon input of $\rho$ vs equality on purifications) Let $\Psi \in \mathfrak{S}_1(AC)$ be a purification of $\rho \in \mathfrak{S}_1(A)$, and let $\mathcal{A}, \mathcal{A}' \in \mathfrak{T}(A, B)$ be two transformations. Then one has

$$\mathcal{A}\, |\Psi)_{AC} = \mathcal{A}'\, |\Psi)_{AC} \;\Longrightarrow\; \mathcal{A} =_\rho \mathcal{A}'. \tag{97}$$

If local discriminability holds, one has the equivalence

$$\mathcal{A}\, |\Psi)_{AC} = \mathcal{A}'\, |\Psi)_{AC} \;\Longleftrightarrow\; \mathcal{A} =_\rho \mathcal{A}'. \tag{98}$$

If one of the two transformations is proportional to a reversible transformation the equivalence of Eq. (98) holds under the weaker assumption of local discriminability on pure states.

Proof. By definition, a state $\sigma$ is in the refinement set $D_\rho$ iff there exists a preparation-test $\{\rho_i\}_{i \in X}$ and an outcome $i_0$ such that $\rho_{i_0} = \sigma$. Using Corollary 9 we have that $\sigma$ is in $D_\rho$ iff there exists an effect $c$ on C such that

$$|\sigma)_A = (c|_C |\Psi)_{AC}. \tag{99}$$

Therefore, we have that $\mathcal{A} =_\rho \mathcal{A}'$ if and only if

$$(c|_C\, \mathcal{A}\, |\Psi)_{AC} = (c|_C\, \mathcal{A}'\, |\Psi)_{AC} \qquad \forall c \in \mathfrak{E}(C), \tag{100}$$

that is, if and only if the states $\mathcal{A}\, |\Psi)_{AC}$ and $\mathcal{A}'\, |\Psi)_{AC}$ cannot be distinguished by local tests, that is, if and only if

$$(b|_B (c|_C\, \mathcal{A}\, |\Psi)_{AC} = (b|_B (c|_C\, \mathcal{A}'\, |\Psi)_{AC} \tag{101}$$

for every product effect $(b|_B (c|_C$. Clearly, if $\mathcal{A}\, |\Psi)_{AC} = \mathcal{A}'\, |\Psi)_{AC}$, this condition is verified: this proves Eq. (97). When local discriminability holds, equality on local tests implies equality on global tests, hence Eq. (98). Finally, if $\mathcal{A} = \lambda\, \mathcal{U}$ with $\mathcal{U}$ reversible, then the state $\mathcal{A}\, |\Psi)_{AC} = \lambda\, \mathcal{U}\, |\Psi)_{AC}$ is proportional to a pure state, and, by local discriminability on pure states, equality on local tests implies equality. ■



C. Dynamically faithful pure states

We show now an important feature of theories with purification: the possibility of imprinting physical transformations into states in an injective way (that is, if two transformations differ, then the corresponding states differ). This feature reduces the tomography of a physical process to the tomography of the corresponding state. Technically speaking, we call dynamically faithful any state that allows for the tomography of physical processes.

Definition 39 (Dynamically faithful state) We say that a state $\sigma \in \mathfrak{S}(AC)$ is dynamically faithful for system A if for any couple of transformations $\mathcal{A}, \mathcal{A}' \in \mathfrak{T}(A, B)$ one has

$$\mathcal{A}\, |\sigma)_{AC} = \mathcal{A}'\, |\sigma)_{AC} \;\Longrightarrow\; \mathcal{A} = \mathcal{A}'. \tag{102}$$

The existence of dynamically faithful mixed states is a quite generic fact: for example, in any theory with local discriminability if one takes a basis $\{\rho_i\} \subset \mathfrak{S}_1(A)$ for $\mathfrak{S}_{\mathbb R}(A)$ and a set $\{\varphi_i\} \subset \mathfrak{S}_1(C)$ of perfectly distinguishable states of some system C, then any mixture $\sigma = \sum_i p_i\, \rho_i \otimes \varphi_i$ is dynamically faithful. The remarkable fact in a theory with purification is that there exist dynamically faithful states, which, in addition, are pure.



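Theorem 8 (Existence of dynamically faithful pure states) Let $\omega \in \mathfrak{S}_1(A)$ be an internal state, and let $\Psi_\omega \in \mathfrak{S}_1(AC)$ be a purification of $\omega$. Then $\Psi_\omega$ is dynamically faithful for system A.

Proof. Suppose that $\mathcal{A}\, |\Psi_\omega)_{AC} = \mathcal{A}'\, |\Psi_\omega)_{AC}$. Then take an arbitrary system D, an internal state $\sigma \in \mathfrak{S}_1(D)$, and a purification of $\sigma$, say $\Psi_\sigma \in \mathfrak{S}_1(DE)$. Clearly, we have $\mathcal{A}\, |\Psi_\omega)_{AC} \otimes |\Psi_\sigma)_{DE} = \mathcal{A}'\, |\Psi_\omega)_{AC} \otimes |\Psi_\sigma)_{DE}$. According to Theorem 7, this implies that $\mathcal{A}$ and $\mathcal{A}'$ coincide upon input of $\omega \otimes \sigma$. Since $\omega$ and $\sigma$ are internal in $\mathfrak{S}(A)$ and $\mathfrak{S}(D)$, respectively, by Theorem 2 $\omega \otimes \sigma$ is internal in $\mathfrak{S}(AD)$, that is, the refinement set $D_{\omega \otimes \sigma}$ is a spanning set for $\mathfrak{S}_{\mathbb R}(AD)$. Now, $\mathcal{A}$ and $\mathcal{A}'$ coincide on a spanning set, and, therefore, they coincide on every state of $\mathfrak{S}(AD)$. Since the ancillary system D is arbitrary, this implies $\mathcal{A} = \mathcal{A}'$. ■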

The converse of the previous Theorem 8 also holds:

Theorem 9 (Characterization of dynamically faithful pure states) A pure state $\Psi \in \mathfrak{S}_1(AC)$ is dynamically faithful for system A if and only if the marginal state $|\omega)_A := (e|_C |\Psi)_{AC}$ is internal.

Proof. The "if" part has just been shown in Theorem 8. To prove the "only if" part, let $a, a'$ be two distinct effects for system A. Since $\Psi$ is dynamically faithful one has $(a|_A |\Psi)_{AC} \neq (a'|_A |\Psi)_{AC}$. This means that there exists an effect $(c|_C$ such that $(a|_A (c|_C |\Psi)_{AC} \neq (a'|_A (c|_C |\Psi)_{AC}$. Defining the state $|\omega_c)_A := (c|_C |\Psi)_{AC}$, this implies $(a|\omega_c)_A \neq (a'|\omega_c)_A$. Since $\omega_c$ is in the refinement set of $\omega$, such a refinement set is separating for $\mathfrak{E}_{\mathbb R}(A)$. But a separating set for $\mathfrak{E}_{\mathbb R}(A)$ must be a spanning set for the dual vector space $\mathfrak{S}_{\mathbb R}(A)$. Hence, $\omega$ is internal. ■

Using this characterization it is immediate to show 
that the product of dynamically faithful pure states is 
dynamically faithful: 

Corollary 13 (Product of dynamically faithful 
states is dynamically faithful) Let $\Psi^{(A)} \in \mathfrak{S}_1(AC)$ 
and $\Psi^{(B)} \in \mathfrak{S}_1(BD)$ be dynamically faithful for systems 
A and B, respectively. Then $\Psi^{(A)} \otimes \Psi^{(B)}$ is dynamically 
faithful for the compound system AB. 

Proof. Since the product of two internal states is 
internal (Theorem ||), the thesis follows from the 
previous Theorem. ■ 

The existence of dynamically faithful pure states 
has remarkable consequences, among which are the 
"no-information without disturbance" and the "no-cloning" 
Theorems, which will be analyzed in the following 
Subsections. 



D. No information without disturbance 

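Definition 40 (Non-disturbing tests) We say that a 
test $\{\mathscr{A}_i\}_{i \in X}$ on system A is non-disturbing upon input 
of $\rho \in \mathfrak{S}(A)$ if 

$$\sum_{i \in X} \mathscr{A}_i\,|\sigma)_A = |\sigma)_A \qquad \forall \sigma \in D_\rho, \tag{103}$$

or, equivalently, if $\sum_{i \in X} \mathscr{A}_i =_\rho \mathscr{I}_A$. If $\rho$ is an internal 
state, we say that the test is non-disturbing, because in 
this case one has 

$$\sum_{i \in X} \mathscr{A}_i\,|\sigma)_A = |\sigma)_A \qquad \forall \sigma \in \mathfrak{S}(A). \tag{104}$$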

Theorem 10 (No information without disturbance) 
In a theory with purification, a test $\{\mathscr{A}_i\}$ on system A 
is non-disturbing upon input of $\rho$ if and only if each 
transformation $\mathscr{A}_i$ is proportional to the identity upon 
input of $\rho$, namely $\mathscr{A}_i =_\rho p_i \mathscr{I}_A$. 

Proof. Let $|\Psi)_{AB}$ be a purification of $\rho$. By Theorem |^, 
the no-disturbance condition $\sum_{i \in X} \mathscr{A}_i =_\rho \mathscr{I}_A$ holds if 
and only if 

$$\sum_{i \in X} \mathscr{A}_i\,|\Psi)_{AB} = |\Psi)_{AB}. \tag{105}$$

Since $\Psi$ is pure, this implies $\mathscr{A}_i\,|\Psi)_{AB} = p_i\,|\Psi)_{AB} = 
(p_i \mathscr{I}_A)\,|\Psi)_{AB}$. Now, since the identity is trivially a 
reversible transformation, according to Theorem |^ this is 
equivalent to $\mathscr{A}_i =_\rho p_i \mathscr{I}_A$. ■ 

Theorem 11 (No joint discrimination of a span- 
ning set of states) In a theory with purification the 
states in a spanning set cannot be perfectly discriminated 
in a single observation-test. 



Proof. By contradiction, suppose that a collection of 
states $\{\rho_i\}_{i \in X}$ is a spanning set, namely $\mathrm{Span}\{\rho_i\}_{i \in X} = 
\mathfrak{S}_R(A)$, and there exists an observation-test $\{a_i\}_{i \in X}$ 
such that $(a_i|\rho_j)_A = \delta_{ij}$. Then, since perfectly 
distinguishable states are linearly independent, and they must 
span a finite dimensional vector space, the number of 
perfectly distinguishable states must be finite. Now 
consider the measure-and-prepare test $\{\mathscr{A}_i\}_{i \in X}$ defined by 
$\mathscr{A}_i = |\rho_i)_A (a_i|_A$. Since the states of the spanning set are 
perfectly distinguishable, the test is non-disturbing. 
Indeed, expanding an arbitrary state $\rho$ on the spanning 
set as $\rho = \sum_j c_j \rho_j$, one has 

$$\sum_i \mathscr{A}_i\,|\rho)_A = \sum_i |\rho_i)_A \Big( \sum_j c_j\,(a_i|\rho_j)_A \Big) = \sum_j c_j\,|\rho_j)_A = |\rho)_A. \tag{106}$$

Since $\mathscr{A}_i \neq p_i \mathscr{I}_A$, this is in contradiction with the 
no-information without disturbance Theorem 10. ■ 

Corollary 14 (No joint discrimination of pure 
states) In a theory with purification, for every system the 
pure states cannot be perfectly discriminated in a single 
observation-test. 

Proof. Since pure states are a spanning set, they cannot 
be perfectly discriminated in a single test, according to 
Theorem 11. ■ 

Corollary 14 provides a simple alternative way to see 
that classical probability theory is excluded by the 
purification Postulate. 
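
To make the exclusion of classical theory concrete, the following is a minimal sketch for a classical bit, under the assumption (not spelled out in the text) that states are represented as probability vectors and transformations as 2x2 real matrices. The names `apply_map`, `coarse_grained`, and `proportional_to_identity` are illustrative only. The measure-and-prepare test built from the two pure states leaves every state unchanged on average, yet its transformations are not proportional to the identity, which is exactly what Theorem 10 forbids in a theory with purification.

```python
# Illustrative sketch (not from the paper): a classical bit, with states
# represented as probability vectors [p0, p1].

def apply_map(matrix, state):
    """Apply a 2x2 real matrix (a classical transformation) to a state vector."""
    return [sum(matrix[r][c] * state[c] for c in range(2)) for r in range(2)]

# Measure-and-prepare transformations A_i = |rho_i)(a_i| for the two pure
# states rho_0 = [1, 0] and rho_1 = [0, 1], read out by the effects a_0, a_1.
A0 = [[1.0, 0.0], [0.0, 0.0]]
A1 = [[0.0, 0.0], [0.0, 1.0]]

def coarse_grained(state):
    """Sum over outcomes of A_i applied to the state (the overall channel)."""
    out0, out1 = apply_map(A0, state), apply_map(A1, state)
    return [out0[k] + out1[k] for k in range(2)]

def proportional_to_identity(matrix, tol=1e-12):
    """True if the matrix equals p * identity for some number p."""
    return (abs(matrix[0][1]) < tol and abs(matrix[1][0]) < tol
            and abs(matrix[0][0] - matrix[1][1]) < tol)

rho = [0.3, 0.7]
print(coarse_grained(rho))                      # same state as the input
print(apply_map(A0, rho), apply_map(A1, rho))   # outcome weights reveal rho
print(proportional_to_identity(A0))             # False
```

The test is non-disturbing and informative at the same time, so a classical bit cannot belong to a theory satisfying the purification Postulate.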

Corollary 15 (Maximum number of perfectly 
distinguishable states) For every system A the maximum 
cardinality of a set of perfectly distinguishable states is 
strictly smaller than $\dim \mathfrak{S}_R(A)$. 

Proof. Since perfectly distinguishable states are linearly 
independent, if one could find $\dim \mathfrak{S}_R(A)$ perfectly 
distinguishable states, then they would form a spanning set, 
in contradiction with Theorem 11. ■ 

Note that the maximum number of distinguishable 
states in quantum theory satisfies a much stronger bound: 
such a number is given by the dimension $d_A$ of the 
system's Hilbert space, while the dimension of the vector 
space spanned by the density matrices is $\dim \mathfrak{S}_R(A) = 
d_A^2$. 

Corollary 16 (Non-unique convex decomposition 
on pure states) In a theory with purification satisfying 
the no-restriction hypothesis of Def. [7^, for every system 
A there is a mixed state $\rho \in \mathfrak{S}_1(A)$ with a non-unique 
convex decomposition on pure states. In other words, the 
convex set $\mathfrak{S}_1(A)$ cannot be a simplex. 

Proof. By contradiction, suppose that $\mathfrak{S}_1(A)$ is a 
simplex. Then the pure states $\{\varphi_i\}$ of A are a finite set, 
and for each of them there is a functional $a_i \in \mathfrak{E}_R(A)$ 
such that $(a_i|\varphi_j) = \delta_{ij}$. Clearly, $a_i$ is positive on every 
state, namely $a_i \in \mathfrak{S}_+(A)^*$. Hence, by the consequence 
of the no-restriction hypothesis stated by Lemma |ll|, we 
have $a_i \in \mathfrak{E}_+(A)$. Moreover, one has $\sum_i (a_i|_A = (e|_A$. In 
Corollary 37 we will show that any such collection $\{a_i\}$ is 
an observation-test. But this test discriminates all pure 
states, in contradiction with Corollary 14. This proves 
that $\mathfrak{S}_1(A)$ cannot be a simplex. ■ 

E. No-cloning 

Definition 41 (Cloning channels) Let A, A' be two 
operationally equivalent systems, and let $\{\rho_i\}_{i \in X}$ be a set 
of states of A. A channel $\mathscr{C}$ from A to AA' is a cloning 
channel for the set $\{\rho_i\}_{i \in X}$ if 

$$\mathscr{C}\,|\rho_i)_A = |\rho_i)_A\,|\rho'_i)_{A'}. \tag{107}$$

If there is a cloning channel, we say that the states 
$\{\rho_i\}_{i \in X}$ are perfectly cloneable. 

We now show that a spanning set of states (in particu- 
lar, the set of pure states) cannot be perfectly cloned. To 
see this we use the equivalence between perfect cloning 
and perfect discrimination, which was originally proved 
in Refs. |6| ^ for causal theories with local discriminabil- 
ity using the tomographic limit. Here we use the stronger 
result of Ref . , which proves the equivalence in any 
convex theory where all "measure-and-prepare" channels 
are allowed, without requiring causality and local dis- 
criminability, and without resorting to the tomographic 
limit. For convenience of the reader, the argument of Ref. 
p9| is reproduced here using the notation of the present 
paper: 

Theorem 12 (Cloning/discrimination equivalence) 
In a convex theory where all "measure-and-prepare" 
channels are allowed, the deterministic states 
$\{\rho_i\}_{i \in X} \subset \mathfrak{S}(A)$ are perfectly cloneable if and only if 
they are perfectly distinguishable. 

Proof. Suppose that the states $\{\rho_i\}_{i \in X}$ can be perfectly 
cloned and consider the binary discrimination between 
two states $\rho_i, \rho_j$, $i \neq j$, with a binary observation-test 
$\{a_i, a_j\}$. Define the worst-case error probability as 

$$p_{wc} := \max\{p(i|j),\, p(j|i)\}, \qquad p(k|l) := (a_k|\rho_l)_A, \tag{108}$$

and take its minimum over all binary tests 

$$p_{wc}^{(opt)} := \min p_{wc}. \tag{109}$$

Now, if a cloning channel exists, we can apply it twice to 
the unknown state, thus getting three identical copies of 
it. Performing the optimal test three times, and then 
using majority voting, we obtain the new error probabilities 
given by 

$$p'(i|j) = f\big(p^{(opt)}(i|j)\big), \qquad f(x) := x^2 (3 - 2x), \tag{110}$$

where $p^{(opt)}(i|j) := (a_i^{(opt)}|\rho_j)_A$. Since f is a 
non-decreasing function for $x \in [0,1]$, we also have $p'_{wc} = 
f\big(p_{wc}^{(opt)}\big)$, and, since $p_{wc}^{(opt)}$ is the minimum error 
probability, by definition $p'_{wc} \geq p_{wc}^{(opt)}$. The only solutions of the 
inequality $f(x) \geq x$ are $x = 0$ and $x \in [1/2, 1]$, and, since 
$p_{wc}^{(opt)}$ must be in the interval $[0, 1/2)$ (see Lemma 0), we 
obtain $p_{wc}^{(opt)} = 0$. This proves that any pair of states 
from the set $\{\rho_i\}_{i \in X}$ can be perfectly distinguished. But 
this implies that using $|X| - 1$ pairwise tests we can 
perfectly discriminate all the states $\{\rho_i\}_{i \in X}$. This proves the 
implication "perfect cloning $\Rightarrow$ perfect discrimination" 
in any convex theory. If the theory contains all possible 
"measure-and-prepare" channels, the converse is 
obviously true: If the states can be perfectly discriminated 
by an observation-test $\{a_i\}_{i \in X}$, then the 
measure-and-prepare channel $\mathscr{C} := \sum_{i \in X} |\rho_i)_A\,|\rho'_i)_{A'}\,(a_i|_A$ is a cloning 
channel. ■ 
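
The fixed-point step of the proof can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the repeated application in `iterate` is an added visualisation (the proof itself applies the majority vote once, to the optimal test). It evaluates the majority-voting map f(x) = x^2 (3 - 2x) and confirms that f(x) < x throughout the open interval (0, 1/2), so an optimal error probability in [0, 1/2) compatible with f(x) >= x can only be zero.

```python
def majority_error(x):
    """Error probability after a majority vote over three independent tests,
    each failing with probability x: x^3 + 3 x^2 (1 - x) = x^2 (3 - 2 x)."""
    return x * x * (3.0 - 2.0 * x)

def iterate(x, rounds):
    """Apply the majority-voting map repeatedly (illustration only)."""
    for _ in range(rounds):
        x = majority_error(x)
    return x

# f(x) < x on a grid inside (0, 1/2); x = 0 and x = 1/2 are fixed points.
grid = [k / 100.0 for k in range(1, 50)]
print(all(majority_error(x) < x for x in grid))
print(majority_error(0.0), majority_error(0.5))
print(iterate(0.4, 10))   # rapidly approaches 0
```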

Since measure-and-prepare channels can be obtained 
by conditioning the choice of a preparation-test on the 
outcome of an observation-test, any causal theory 
satisfies the hypotheses of the previous Theorem, which 
becomes: 

Corollary 17 (Cloning/discrimination equivalence 
in causal theories) In a causal theory the states 
$\{\rho_i\}_{i \in X} \subset \mathfrak{S}_1(A)$ are perfectly cloneable if and only if 
they are perfectly distinguishable. 

Remark (Non-causal theories with all 
measure-and-prepare channels). Note that there are also 
non-causal theories that contain all measure-and-prepare 
channels. An example can be constructed by starting 
from a causal theory $\Theta$, and by regarding the set of 
transformations $\mathfrak{T}(A,B)$ from A to B as the set of "states" 
$\mathfrak{S}'(A \to B)$ of the system "$A \to B$" in a new 
second-order theory $\Theta'$. Performing an observation-test on a 
"state" $\mathscr{C} \in \mathfrak{S}'(A \to B)$ is then interpreted in the 
underlying causal theory $\Theta$ as applying the transformation 
$\mathscr{C} \in \mathfrak{T}(A,B)$ to an input state $\sigma \in \mathfrak{S}_1(AC)$, and 
subsequently performing an observation-test $\{b_i\}_{i \in X}$ on the 
output state $(\mathscr{C} \otimes \mathscr{I}_C)\,|\sigma)_{AC}$. Of course, since the theory 
is causal, one can use conditioning and perform a 
channel $\mathscr{C}_i$ that depends on the outcome i. This provides the 
realization of an arbitrary measure-and-prepare channel 
in the non-causal theory $\Theta'$. 

Coming back to causal theories with purification, the 
results proved so far imply the following no-cloning state- 
ment: 

Corollary 18 (No-cloning of states in a spanning 
set) In a theory with purification, a cloning channel for 
a spanning set of states cannot exist. In particular, pure 
states cannot be cloned. 

Proof. Immediate consequence of Corollary 17 
combined with Theorem 11 and Corollary 14. ■ 



VIII. PROBABILISTIC TELEPORTATION 

A. Entanglement-swapping and teleportation 

As we previously showed, in a theory with purification 
there must be entangled states (according to the usual 
definition). We now show the possibility of 
probabilistic entanglement swapping: 

Theorem 13 (Probabilistic entanglement-swapping) 
Let $\Psi \in \mathfrak{S}_1(AB)$ be a pure state, and 
let A' and B' be operationally equivalent to A and 
B, respectively. Then there exist an atomic effect 
$E_\Psi \in \mathfrak{E}(BA')$ (see Def. ^^) and a non-zero probability 
$p_\Psi$ such that 

$$(E_\Psi|_{BA'}\,|\Psi)_{AB}\,|\Psi)_{A'B'} = p_\Psi\,|\Psi)_{AB'}. \tag{111}$$

Proof. Let us define the marginal states 

$$|\rho)_A := (e|_B\,|\Psi)_{AB}, \qquad |\tilde\rho)_B := (e|_A\,|\Psi)_{AB}. \tag{112}$$

By Theorem | we have that there exists a non-zero 
probability $p_\Psi$ such that $p_\Psi \Psi \in D_{\rho \otimes \tilde\rho}$. Since $|\Psi)_{AB}\,|\Psi)_{A'B'}$ is 
a purification of $|\rho)_A\,|\tilde\rho)_{B'}$, using corollary ^ we get the 
thesis. The effect $E_\Psi$ can be taken to be atomic: indeed, 
if it were refinable, i.e. $E_\Psi = \sum_i E_i$, since the right hand 
side of Eq. (111) is a pure state, each effect $E_i$ would 
achieve entanglement swapping. ■ 

Remark (PR boxes are excluded by the purifi- 
cation Postulate) . The possibility of probabilistic en- 
tanglement swapping shows that the purification Postu- 
late excludes the theory of Popescu-Rohrlich boxes (see 
Ref. Q for the definition of transformations on boxes 
and states of multipartite boxes). Indeed, Refs. [|[ [l0|] 
showed that probabilistic entanglement swapping is im- 
possible in this theory. 

Corollary 19 (Probabilistic teleportation) Let $\Psi \in 
\mathfrak{S}_1(AB)$ be a pure state, and let $\rho \in \mathfrak{S}_1(A)$ and $\tilde\rho \in 
\mathfrak{S}_1(B)$ be its marginals. Let A' and B' be operationally 
equivalent to A and B, respectively. Then, there exist 
an atomic effect $E_\Psi \in \mathfrak{E}(BA')$ and a non-zero probability 
$p_\Psi$ such that 

$$(E_\Psi|_{BA'}\,|\Psi)_{AB} =_\rho p_\Psi\,\mathscr{I}_{A' \to A} \tag{113}$$

and 

$$(E_\Psi|_{BA'}\,|\Psi)_{A'B'} =_{\tilde\rho} p_\Psi\,\mathscr{I}_{B \to B'}, \tag{114}$$

where $\mathscr{I}_{A' \to A}$ and $\mathscr{I}_{B \to B'}$ denote the identity 
transformations between operationally equivalent systems. 
In particular, if $\rho$ is an internal state, one has the 
probabilistic teleportation scheme 

$$(E_\Psi|_{BA'}\,|\Psi)_{AB} = p_\Psi\,\mathscr{I}_{A' \to A}. \tag{115}$$

Proof. Just combine Theorems |l^ and ^. ■ 

The diagram of probabilistic teleportation (115) is one 
of the main axioms in the categorial approach by 
Abramsky and Coecke. In the present approach, this 
property is derived from the purification postulate, rather 
than being assumed from the start. 

For theories with local discriminability the probability 
of teleportation is related to the dimension of the state 
space as follows: 

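Lemma 22 (Maximum teleportation probability) 
If local discriminability holds, then the probability of 
teleportation $p_\Psi$ in Eq. (115) satisfies the bound 

$$p_\Psi \leq \frac{1}{\dim \mathfrak{S}_R(A)}. \tag{116}$$

Proof. Let us choose two bases $\{\rho_i\}$ and $\{\tilde\rho_j\}$ for 
the vector spaces $\mathfrak{S}_R(A)$ and $\mathfrak{S}_R(\tilde A)$, respectively, and 
write $\Psi$ as $|\Psi)_{A\tilde A} = \sum_{i,j} A_{ij}\,|\rho_i)_A\,|\tilde\rho_j)_{\tilde A}$. Now take the 
dual bases $\{\rho_i^*\}$ and $\{\tilde\rho_j^*\}$ for the dual vector spaces 
$\mathfrak{E}_R(A)$ and $\mathfrak{E}_R(\tilde A)$, respectively, so that $(\rho_i^*|\rho_j)_A = \delta_{ij}$ 
and $(\tilde\rho_k^*|\tilde\rho_l)_{\tilde A} = \delta_{kl}$, and write $E_\Psi$ as $(E_\Psi|_{\tilde A A'} = 
\sum_{k,l} B_{kl}\,(\tilde\rho_k^*|_{\tilde A}\,(\rho_l^*|_{A'}$. The teleportation diagram (115) 
is then equivalent to the matrix equality 

$$AB = p_\Psi\,I_A, \tag{117}$$

where $I_A$ is the identity matrix of size $\dim \mathfrak{S}_R(A)$. 
Finally, since probabilities are bounded by unit, we obtain 

$$1 \geq (E_\Psi|\Psi)_{A\tilde A} = \mathrm{Tr}[AB] = p_\Psi \dim \mathfrak{S}_R(A), \tag{118}$$

which is the desired bound. ■ 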

Remark (quantum theory achieves the bound). 
Note that in quantum theory the teleportation 
probability achieves the maximum value allowed by the bound 
of Eq. (116): For a d-dimensional Hilbert space, the 
real vector space spanned by all density matrices has 
dimension $d^2$, and $1/d^2$ is exactly the maximum probability 
of conclusive teleportation. 
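
As a hedged numerical illustration of this remark, the sketch below simulates conclusive quantum teleportation of a pure state through the maximally entangled state, with the effect given by projection onto the same maximally entangled state. Both choices, and the function names, are assumptions made for the example and are not fixed by the text. The unnormalized output is the input vector divided by d, so the success probability is 1/d^2.

```python
import math

def teleport_amplitudes(psi):
    """Unnormalized output on B of (<Phi|_{A'A} (x) I_B)(|psi>_{A'} (x) |Phi>_{AB}),
    where |Phi> = sum_k |k>|k> / sqrt(d) is the maximally entangled state."""
    d = len(psi)
    norm = 1.0 / math.sqrt(d)
    out = [0j] * d
    for a_prime in range(d):          # basis index of the input system A'
        for a in range(d):            # basis index of system A
            for b in range(d):        # basis index of the output system B
                bra = norm if a_prime == a else 0.0   # <Phi|a', a>
                ket = norm if a == b else 0.0         # <a, b|Phi>
                out[b] += bra * psi[a_prime] * ket
    return out

def success_probability(psi):
    """Probability of the teleporting outcome for a normalized input psi."""
    return sum(abs(x) ** 2 for x in teleport_amplitudes(psi))

print(success_probability([0.6, 0.8j]))        # close to 1/4 for a qubit
print(success_probability([1.0, 0.0, 0.0]))    # close to 1/9 for a qutrit
```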

A simple consequence of probabilistic teleportation is 
the possibility of remotely preparing any bipartite state 
by acting locally on the purifying system only, according 
to the following definition 

Definition 42 (Preparationally faithful state) A 
state $\Psi \in \mathfrak{S}_1(AB)$ is preparationally faithful for system 
B if for every bipartite state $\sigma \in \mathfrak{S}_1(AC)$ there are a 
transformation $\mathscr{A}_\sigma \in \mathfrak{T}(B,C)$ and a non-zero probability 
$p_\sigma$ such that 

$$(\mathscr{I}_A \otimes \mathscr{A}_\sigma)\,|\Psi)_{AB} = p_\sigma\,|\sigma)_{AC}. \tag{119}$$

Corollary 20 (Existence of preparationally 
faithful pure states) Let $\Psi \in \mathfrak{S}_1(AB)$ be the purification of 
an internal state $\omega \in \mathfrak{S}_1(A)$. Then, $\Psi$ is preparationally 
faithful for system B. 

Proof. Let $E_\Psi$ be the teleportation effect for $\Psi$ as 
defined in Corollary 19. Define the transformation $\mathscr{A}_\sigma$ as 

$$\mathscr{A}_\sigma := (E_\Psi|_{BA'}\,|\sigma)_{A'C}, \tag{120}$$

where the state $\sigma$ is prepared on A'C, with A' 
operationally equivalent to A. Applying $\mathscr{A}_\sigma$ to $\Psi$ and using 
Eq. (115) we then obtain 

$$(\mathscr{I}_A \otimes \mathscr{A}_\sigma)\,|\Psi)_{AB} = (E_\Psi|_{BA'}\,|\Psi)_{AB}\,|\sigma)_{A'C} = p_\Psi\,|\sigma)_{AC}. \tag{121}$$

Hence, the thesis holds with $p_\sigma = p_\Psi$, independently of 
$\sigma$. ■ 

B. Storing and probabilistic retrieving of 
transformations 



Here we consider the task of storing an unknown trans- 
formation in the state of some system. The output state 
of such a storing protocol then becomes a "program" 
from which the transformation can be retrieved at later 
time. The task is achieved probabilistically by a machine 
that retrieves the transformation from the program and 
applies it on a new input state. 

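Corollary 21 (Storing and probabilistic retrieving) 
Let $\Psi \in \mathfrak{S}_1(A\tilde A)$ be a pure dynamically faithful state 
for system A, according to Def. 39. The storing 
protocol, consisting in the application of a transformation 
$\mathscr{C} \in \mathfrak{T}(A,B)$ to the input state $\Psi$, as in 

$$|R_{\mathscr{C}})_{B\tilde A} := (\mathscr{C} \otimes \mathscr{I}_{\tilde A})\,|\Psi)_{A\tilde A}, \tag{122}$$

defines an injective map from transformations $\mathscr{C} \in 
\mathfrak{T}(A,B)$ to bipartite states $R \in \mathfrak{S}(B\tilde A)$ satisfying the 
property 

$$(e|_B\,|R)_{B\tilde A} \in D_{\tilde\omega}, \tag{123}$$

where $\tilde\omega$ is the marginal state $|\tilde\omega)_{\tilde A} = (e|_A\,|\Psi)_{A\tilde A}$. The 
inverse map is given by the probabilistic retrieving 
protocol 

$$(E_\Psi|_{\tilde A A'}\,|R_{\mathscr{C}})_{B\tilde A} = p_\Psi\,\mathscr{C}, \tag{124}$$

where $E_\Psi$ is the teleportation effect for state $\Psi$ and $p_\Psi$ is 
the corresponding teleportation probability, as defined in 
Eq. (115). 

Proof. Since the state $\Psi$ is dynamically faithful, the 
map $\mathscr{C} \mapsto R_{\mathscr{C}}$ is injective. Now, any transformation $\mathscr{C}$ 
is part of a test $\{\mathscr{C}_i\}_{i \in X}$, namely one has $\mathscr{C} = \mathscr{C}_{i_0}$ for 
some outcome $i_0$. Defining the coarse-grained channel 
$\mathscr{C}_X := \sum_{i \in X} \mathscr{C}_i$ we have 

$$\sum_{i \in X} (e|_B\,|R_{\mathscr{C}_i})_{B\tilde A} = \sum_{i \in X} (e|_B\,\mathscr{C}_i\,|\Psi)_{A\tilde A} = (e|_B\,\mathscr{C}_X\,|\Psi)_{A\tilde A} = (e|_A\,|\Psi)_{A\tilde A} = |\tilde\omega)_{\tilde A}, \tag{125}$$

having used the normalization condition $(e|_B\,\mathscr{C}_X = (e|_A$. 
This implies that $(e|_B\,|R_{\mathscr{C}})_{B\tilde A}$ is in the refinement set of $\tilde\omega$, 
thus proving Eq. (123). The identity (124) simply follows 
by writing $p_\Psi \mathscr{C} = \mathscr{C} \circ (p_\Psi \mathscr{I}_A)$ and substituting $p_\Psi \mathscr{I}_A$ as 
in Eq. (115). ■ 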



In Section IX we will show that the correspondence 
$\mathscr{C} \mapsto R_{\mathscr{C}}$ is also surjective on the set of bipartite states 
satisfying Eq. (123). This will provide an isomorphism 
between transformations and bipartite states that enjoys 
all the structural properties of the Choi-Jamiolkowski 
isomorphism of quantum theory ji^ . 
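
For orientation, the quantum special case of the storing protocol can be sketched numerically. The example below rests on assumptions not made in the text: the faithful state is taken to be the maximally entangled state, the transformations are given by Kraus operators, and the amplitude-damping channel with parameter `gamma` is just a convenient test case. The function `choi_marginal` is a hypothetical helper that computes the marginal of the stored state on the reference system: for a channel it equals the maximally mixed state, in line with Eq. (125), while for a single Kraus term it is a sub-normalized piece of it, in line with Eq. (123).

```python
import math

def choi_marginal(kraus):
    """Marginal on the reference system of R = (C (x) I)|Phi><Phi|,
    with |Phi> = sum_a |a>|a> / sqrt(d) and C given by Kraus operators.
    Entry [a][a2] equals (1/d) * sum_k sum_b K_k[b][a] * conj(K_k[b][a2])."""
    rows = len(kraus[0])        # output dimension
    d = len(kraus[0][0])        # input dimension
    marginal = [[0j] * d for _ in range(d)]
    for K in kraus:
        for a in range(d):
            for a2 in range(d):
                for b in range(rows):
                    marginal[a][a2] += K[b][a] * K[b][a2].conjugate() / d
    return marginal

gamma = 0.3  # damping strength, chosen arbitrarily for the example
K0 = [[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]]
K1 = [[0.0, math.sqrt(gamma)], [0.0, 0.0]]

print(choi_marginal([K0, K1]))   # close to the maximally mixed state I/2
print(choi_marginal([K0]))       # a sub-normalized part of I/2
```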

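The probabilistic retrieving of Eq. (124) implies a 
bound on the operational distance between two 
transformations in terms of the operational distance 
between the corresponding states: 

Theorem 14 Let $\mathscr{A}_0, \mathscr{A}_1 \in \mathfrak{T}(A,B)$ be two 
transformations and $R_{\mathscr{A}_0}, R_{\mathscr{A}_1} \in \mathfrak{S}(B\tilde A)$ be the corresponding states 
as in Eq. (122). Then one has the bound 

$$\| \mathscr{A}_1 - \mathscr{A}_0 \|_{A,B} \leq \frac{\| R_{\mathscr{A}_1} - R_{\mathscr{A}_0} \|_{B\tilde A}}{p_\Psi}, \tag{126}$$

where $p_\Psi$ is the probability of retrieving a transformation 
from the corresponding state, as defined in Eq. (124). 

Proof. Define $\Delta := \mathscr{A}_1 - \mathscr{A}_0$ and $R_\Delta := R_{\mathscr{A}_1} - R_{\mathscr{A}_0}$. 
Take an ancillary system C and a state $\rho \in \mathfrak{S}_1(AC)$. 
Then Eq. (124) implies 

$$\Delta\,|\rho)_{AC} = \frac{1}{p_\Psi}\,(E_\Psi|_{\tilde A A}\,|R_\Delta)_{B\tilde A}\,|\rho)_{AC}. \tag{127}$$

Applying a bipartite effect $(a|_{BC}$ on both sides we then 
obtain 

$$(a|_{BC}\,\Delta\,|\rho)_{AC} = \frac{(b_\rho|R_\Delta)_{B\tilde A}}{p_\Psi}, \tag{128}$$

where $(b_\rho|_{B\tilde A} := [(a|_{BC} \otimes (E_\Psi|_{\tilde A A}]\,|\rho)_{AC}$. Since $b_\rho$ is an 
effect, the above equality implies the bound 

$$\frac{\inf_b\,(b|R_\Delta)_{B\tilde A}}{p_\Psi} \leq (a|_{BC}\,\Delta\,|\rho)_{AC} \leq \frac{\sup_b\,(b|R_\Delta)_{B\tilde A}}{p_\Psi}. \tag{129}$$

By the definition of operational norm, this 
implies 

$$\| \Delta \rho \|_{BC} \leq \| R_\Delta \|_{B\tilde A} / p_\Psi. \tag{130}$$

Finally, taking the supremum over the ancillary system 
C we get the desired bound. ■ 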



C. Systems of purifications and the link product 

For every system A we now fix a dynamically faithful 
pure state $|\Psi^{(A)})_{A\tilde A}$, where $\tilde A$ is some suitable purifying 
system. According to the characterization of dynamically 
faithful pure states given in Theorem 9, the marginal 
state $|\omega)_A := (e|_{\tilde A}\,|\Psi^{(A)})_{A\tilde A}$ is internal. The role 
of the upper index in $\Psi^{(A)}$ is precisely to recall that the 
marginal is internal for system A, while it may not be 
internal for the purifying system $\tilde A$. Moreover, we denote 
by $(E^{(A)}|_{\tilde A A}$ and by $p_A$ the effect and the probability 
appearing in the teleportation scheme (115), respectively. 

Note that since the product of dynamically 
faithful pure states is dynamically faithful 
(Corollary 13), for bipartite systems AB we can choose 
$|\Psi^{(AB)})_{AB\tilde A\tilde B} = |\Psi^{(A)})_{A\tilde A}\,|\Psi^{(B)})_{B\tilde B}$. Likewise, we 
can choose $(E^{(AB)}|_{\tilde A\tilde B AB} = (E^{(A)}|_{\tilde A A}\,(E^{(B)}|_{\tilde B B}$, and 
$p_{AB} = p_A p_B$. We call a system of purifications such a 
choice of bipartite states and effects: 

Definition 43 (System of purifications) A system 
of purifications is a choice of dynamically faithful pure 
states $|\Psi^{(A)})_{A\tilde A}$ and teleportation effects $(E^{(A)}|_{\tilde A A}$ that 
satisfies the properties 

$$|\Psi^{(AB)})_{AB\tilde A\tilde B} = |\Psi^{(A)})_{A\tilde A}\,|\Psi^{(B)})_{B\tilde B}, \qquad (E^{(AB)}|_{\tilde A\tilde B AB} = (E^{(A)}|_{\tilde A A}\,(E^{(B)}|_{\tilde B B}. \tag{131}$$

Once a system of purifications has been fixed, one can 
discuss the composition of transformations in terms of 
composition of states, generalizing the definitions and the 
results introduced by Refs. |3^, ^ in the quantum set- 
ting. 

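Definition 44 (Link product) The link product of 
two vectors $\rho \in \mathfrak{S}_R(B\tilde A)$ and $\sigma \in \mathfrak{S}_R(C\tilde B)$ is the 
vector $\rho * \sigma \in \mathfrak{S}_R(C\tilde A)$ given by 

$$|\rho * \sigma)_{C\tilde A} := \frac{1}{p_B}\,(E^{(B)}|_{\tilde B B}\,|\rho)_{B\tilde A}\,|\sigma)_{C\tilde B}. \tag{132}$$

Note that if $\rho$ and $\sigma$ are proportional to states, then also 
$\rho * \sigma$ is proportional to a state: one has $\rho * \sigma \in \mathfrak{S}_+(C\tilde A)$ 
for any couple $\rho \in \mathfrak{S}_+(B\tilde A)$, $\sigma \in \mathfrak{S}_+(C\tilde B)$. 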

The product and composition of transformations are 
then given by the following 



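Corollary 22 (Composition of states) Consider the 
correspondence $\mathscr{C} \mapsto R_{\mathscr{C}}$ given by the storing protocol in Eq. (122). 
For two transformations $\mathscr{C} \in \mathfrak{T}(A,B)$ and $\mathscr{D} \in \mathfrak{T}(C,D)$ 
one has 

$$|R_{\mathscr{C} \otimes \mathscr{D}})_{BD\tilde A\tilde C} = |R_{\mathscr{C}})_{B\tilde A}\,|R_{\mathscr{D}})_{D\tilde C}. \tag{133}$$

For two transformations $\mathscr{C} \in \mathfrak{T}(A,B)$ and $\mathscr{D} \in \mathfrak{T}(B,C)$ 
one has 

$$R_{\mathscr{D}\mathscr{C}} = R_{\mathscr{C}} * R_{\mathscr{D}}. \tag{134}$$

Proof. The first equation follows from the fact that 
$|\Psi^{(AC)})_{AC\tilde A\tilde C} = |\Psi^{(A)})_{A\tilde A}\,|\Psi^{(C)})_{C\tilde C}$, while the second 
follows from the probabilistic retrieving of Eq. (124): 

$$|R_{\mathscr{D}\mathscr{C}})_{C\tilde A} = \mathscr{D}\,|R_{\mathscr{C}})_{B\tilde A} = \frac{1}{p_B}\,(E^{(B)}|_{\tilde B B}\,|R_{\mathscr{C}})_{B\tilde A}\,|R_{\mathscr{D}})_{C\tilde B}. \tag{135}$$

■ 

IX. DILATION OF PHYSICAL PROCESSES 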

In this Section we derive dilation Theorems for chan- 
nels, observation-tests, and general tests. These Theorems 
extend to all theories with purification the validity of the 
theorems by Stinespring |Q, Naimark |Q, and Ozawa 
p5[ , originally obtained in the setting of operator alge- 
bras. 



A. Reversible dilation of channels 

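In order to derive the reversible dilation of a channel 
we need the following lemma: 

Lemma 23 Let $R \in \mathfrak{S}_1(B\tilde A)$ be a state such that 

$$(e|_B\,|R)_{B\tilde A} = (e|_A\,|\Psi^{(A)})_{A\tilde A}, \tag{136}$$

where $\Psi^{(A)}$ is a pure dynamically faithful state for 
system A. Then there exist a system C, a pure state 
$\varphi_0 \in \mathfrak{S}_1(BC)$, and a reversible channel $\mathscr{U} \in \mathfrak{T}(ABC)$ 
such that 

$$|R)_{B\tilde A} = (e|_{AC}\,\mathscr{U}\,|\Psi^{(A)})_{A\tilde A}\,|\varphi_0)_{BC}. \tag{137}$$

Moreover, the channel $\mathscr{V} \in \mathfrak{T}(A,ABC)$ defined by $\mathscr{V} := 
\mathscr{U}\,|\varphi_0)_{BC}$ is unique up to reversible channels on AC. 

Proof. Take a purification of R, say $\Psi_R \in \mathfrak{S}_1(CB\tilde A)$ for 
some purifying system C. One has 

$$(e|_{CB}\,|\Psi_R)_{CB\tilde A} = (e|_B\,|R)_{B\tilde A} = (e|_A\,|\Psi^{(A)})_{A\tilde A}, \tag{138}$$

that is, the pure states $\Psi_R$ and $\Psi^{(A)}$ have the same 
marginal on system $\tilde A$. Applying the uniqueness of 
purification as expressed by Lemma 21 one then obtains 

$$|\Psi_R)_{CB\tilde A} = (e|_A\,\mathscr{U}\,|\Psi^{(A)})_{A\tilde A}\,|\varphi_0)_{BC}. \tag{139}$$

Applying the deterministic effect on system C on both 
sides, one then proves Eq. (137). Moreover, if $\mathscr{V}' := 
\mathscr{U}'\,|\varphi'_0)_{BC}$ is a channel such that Eq. (137) holds, then the 
pure states $\mathscr{V}\,|\Psi^{(A)})_{A\tilde A}$ and $\mathscr{V}'\,|\Psi^{(A)})_{A\tilde A}$ have the same 
marginal on system $B\tilde A$. Uniqueness of purification then 
implies 

$$\mathscr{V}'\,|\Psi^{(A)})_{A\tilde A} = \mathscr{W}\,\mathscr{V}\,|\Psi^{(A)})_{A\tilde A} \tag{140}$$

for some reversible channel $\mathscr{W} \in \mathfrak{T}(AC)$. Since $\Psi^{(A)}$ is 
dynamically faithful for A, this implies $\mathscr{V}' = \mathscr{W}\mathscr{V}$. ■ 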

We now give the definitions of dilation, environment, 
and reversible dilation: 

Definition 45 (Dilation of a channel) A dilation of 
channel $\mathscr{C} \in \mathfrak{T}(A,B)$ is a channel $\mathscr{V} \in \mathfrak{T}(A,BE)$ such 
that 

$$(e|_E\,\mathscr{V} = \mathscr{C}. \tag{141}$$

We refer to system E as the environment. 

Definition 46 (Reversible dilation) A dilation $\mathscr{V} \in 
\mathfrak{T}(A,BE)$ is called reversible if there exists a system $E_0$ 
such that $AE_0 \simeq BE$ and 

$$\mathscr{V} = \mathscr{U}\,|\varphi_0)_{E_0} \tag{142}$$

for some pure state $\varphi_0 \in \mathfrak{S}_1(E_0)$ and some reversible 
channel $\mathscr{U} \in \mathfrak{T}(AE_0, BE)$. 

According to the above definitions, we have the follow- 
ing dilation theorem: 

Theorem 15 (Reversible dilation of channels) 
Every channel $\mathscr{C} \in \mathfrak{T}(A,B)$ has a reversible dilation 
$\mathscr{V} \in \mathfrak{T}(A,BE)$. If $\mathscr{V}, \mathscr{V}' \in \mathfrak{T}(A,BE)$ are two reversible 
dilations of the same channel, then they are connected by 
a reversible transformation on the environment, namely 

$$\mathscr{V}' = (\mathscr{I}_B \otimes \mathscr{W})\,\mathscr{V} \tag{143}$$

for some reversible channel $\mathscr{W} \in \mathbf{G}_E$. 

Proof. Let us store the channel $\mathscr{C}$ in the faithful state 
$\Psi^{(A)} \in \mathfrak{S}_1(A\tilde A)$, thus getting the state $R_{\mathscr{C}}$, as in Eq. 
(122). Since $\mathscr{C}$ is a channel, it satisfies the normalization 
condition 

$$(e|_B\,\mathscr{C} = (e|_A, \tag{144}$$

which implies 

$$(e|_B\,|R_{\mathscr{C}})_{B\tilde A} = (e|_A\,|\Psi^{(A)})_{A\tilde A}. \tag{145}$$

Now, applying Lemma 23 we obtain 

$$|R_{\mathscr{C}})_{B\tilde A} = (e|_{AC}\,\mathscr{U}\,|\Psi^{(A)})_{A\tilde A}\,|\varphi_0)_{BC}. \tag{146}$$

Since $\Psi^{(A)}$ is dynamically faithful for system A, this 
implies 

$$\mathscr{C} = (e|_{AC}\,\mathscr{U}\,|\varphi_0)_{BC}. \tag{147}$$

Therefore, $\mathscr{V} := \mathscr{U}\,|\varphi_0)_{BC}$ is a reversible dilation of $\mathscr{C}$, 
with $E_0 := BC$ and $E := AC$. Finally, the uniqueness 
clause in Lemma 23 implies uniqueness of the dilation. ■ 

Moreover, two reversible dilations of the same channel 
with different environments are related as follows: 

Lemma 24 Let $\mathscr{V} \in \mathfrak{T}(A,BE)$ and $\mathscr{V}' \in \mathfrak{T}(A,BE')$ be 
two reversible dilations of the same channel $\mathscr{C}$, with 
generally different environments E and E'. Then there is a 
channel $\mathscr{Z}$ from E to E' such that 

$$\mathscr{V}' = (\mathscr{I}_B \otimes \mathscr{Z})\,\mathscr{V}. \tag{148}$$

The channel $\mathscr{Z}$ has the form 

$$\mathscr{Z} = (e|_E\,\mathscr{U}\,|\eta_0)_{E'} \tag{149}$$

for some pure state $\eta_0 \in \mathfrak{S}_1(E')$ and some reversible 
transformation $\mathscr{U} \in \mathfrak{T}(EE')$. 

Proof. Apply $\mathscr{V}$ and $\mathscr{V}'$ to the faithful state $\Psi^{(A)}$ and 
then use the uniqueness of purification stated in Lemma 
21. ■ 

The above results represent the general version — 
holding in all probabilistic theories with purification — 
of the dilation scheme implied by Stinespring's Theorem 
p3| in quantum theory. However, differently from the 
proof of Stinespring's Theorem, the present proof does 
not require any C*-algebraic structure, being based just 
on the purification postulate. In fact, it is easy to see 
that the purification of states and the reversible dilation 
of channels are equivalent features, in the following sense: 

Corollary 23 (Equivalence between purification 
and reversible dilation) Existence and uniqueness (up 
to reversible channels on the purifying system) of the pu- 
rification of states is equivalent to existence and unique- 
ness (up to reversible channels on the environment) of 
the reversible dilation of channels. 

Proof. The direction "purification => dilation" has been 
just proved by the dilation theorem. The converse is 
obvious, since a normalized state $\rho \in \mathfrak{S}_1(B)$ is a special 
case of channel from the trivial system I to B, and in this 
special case purification coincides with dilation. ■ 
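
In the quantum special case, the dilation of a channel can be illustrated with the standard Stinespring construction. The sketch below is an example under stated assumptions, not the paper's construction: the channel is specified by Kraus operators, the helper names are hypothetical, and only the isometry V (playing the role of $\mathscr{U}|\varphi_0)$ in Definition 46) is built, without completing it to a unitary. The code checks that V is an isometry and that discarding the environment reproduces the channel.

```python
import math

def stinespring_isometry(kraus):
    """Build V = sum_k K_k (x) |k>, a (rows * n) x d matrix whose row index
    (b, k) is flattened as b * n + k, with n the number of Kraus operators."""
    n = len(kraus)
    rows, d = len(kraus[0]), len(kraus[0][0])
    V = [[0j] * d for _ in range(rows * n)]
    for k, K in enumerate(kraus):
        for b in range(rows):
            for a in range(d):
                V[b * n + k][a] = complex(K[b][a])
    return V

def gram(V):
    """Return V^dagger V; it equals the identity when V is an isometry."""
    d = len(V[0])
    return [[sum(V[r][i].conjugate() * V[r][j] for r in range(len(V)))
             for j in range(d)] for i in range(d)]

def discard_environment(V, rho, n):
    """Compute Tr_E[V rho V^dagger], tracing over the environment index k."""
    rows, d = len(V) // n, len(V[0])
    out = [[0j] * rows for _ in range(rows)]
    for b in range(rows):
        for b2 in range(rows):
            for k in range(n):
                for a in range(d):
                    for a2 in range(d):
                        out[b][b2] += (V[b * n + k][a] * rho[a][a2]
                                       * V[b2 * n + k][a2].conjugate())
    return out

gamma = 0.3  # amplitude-damping strength, an arbitrary example value
K0 = [[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]]
K1 = [[0.0, math.sqrt(gamma)], [0.0, 0.0]]
V = stinespring_isometry([K0, K1])

excited = [[0.0, 0.0], [0.0, 1.0]]          # the state |1><1|
print(gram(V))                               # close to the 2x2 identity
print(discard_environment(V, excited, 2))    # close to diag(gamma, 1 - gamma)
```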

Finally, the reversible dilation of a channel allows one 
to define the complementary channel as follows 

Definition 47 (Complementary channel) Let $\mathscr{V} \in 
\mathfrak{T}(A,BE)$ be a reversible dilation of channel $\mathscr{C} \in \mathfrak{T}(A,B)$, 
as in Theorem 15. The complementary channel of $\mathscr{C}$ is 
the channel $\tilde{\mathscr{C}} \in \mathfrak{T}(A,E)$ defined by 

$$\tilde{\mathscr{C}} := (e|_B\,\mathscr{V}. \tag{150}$$

Note that the complementary channel is unique up to 
reversible transformations on the environment E. 

The notion of complementary channel has played a 
crucial role in the research about capacity of quantum 



information channels (see e.g. |5Cl|-|52|) and we expect 
that having the same definition in general probabilistic 
theories will be very fruitful (in fact, a number of conse- 
quences is already presented in the Section Xl). 



B. Reversible dilation of tests 

We now generalize the dilation of channels (i.e. 
single-outcome tests) to the case of arbitrary tests. For this 
purpose, we need the analogue of Lemma 23: 

Lemma 25 Let $\{R_i\}_{i \in X}$ be a preparation-test for system 
$B\tilde A$ with the property 

$$\sum_{i \in X} (e|_B\,|R_i)_{B\tilde A} = (e|_A\,|\Psi^{(A)})_{A\tilde A}, \tag{151}$$

where $\Psi^{(A)}$ is the purification of an internal state of 
system A. Then, there exist a system C, a pure state 
$\varphi_0 \in \mathfrak{S}_1(BC)$, a reversible channel $\mathscr{U} \in \mathfrak{T}(ABC)$, and 
an observation-test $\{c_i\}_{i \in X}$ on C such that 

$$|R_i)_{B\tilde A} = (e|_A\,(c_i|_C\,\mathscr{U}\,|\Psi^{(A)})_{A\tilde A}\,|\varphi_0)_{BC} \tag{152}$$

for any outcome $i \in X$. By suitably choosing system C, 
the observation-test $\{c_i\}_{i \in X}$ can be taken to be a 
discriminating test. 

Proof. Take a purification of the coarse-grained state 
$R = \sum_i R_i$, say $\Psi_R \in \mathfrak{S}_1(CB\tilde A)$ for some purifying 
system C. According to Theorem ||, there is an 
observation-test $\{c_i\}_{i \in X}$ on C such that 

$$|R_i)_{B\tilde A} = (c_i|_C\,|\Psi_R)_{CB\tilde A} \qquad \forall i \in X, \tag{153}$$

and, by suitably choosing C, $\{c_i\}$ can be chosen to be a 
discriminating test. Following the same line of Lemma 
23 we then obtain the thesis. ■ 

Following the proof of the reversible dilation of 
channels given in Theorem 15 we have the following 

Theorem 16 (Reversible dilation of tests) For 
every test $\{\mathscr{C}_i\}_{i \in X}$ from system A to system B there exist a 
system C, a pure state $\varphi_0 \in \mathfrak{S}_1(BC)$, a reversible 
channel $\mathscr{U} \in \mathfrak{T}(ABC)$, and an observation-test $\{c_i\}_{i \in X}$ on C 
such that for all outcomes $i \in X$ 

$$\mathscr{C}_i = (e|_A\,(c_i|_C\,\mathscr{U}\,|\varphi_0)_{BC}. \tag{154}$$

By suitably choosing system C, the observation-test 
$\{c_i\}_{i \in X}$ can be taken to be a discriminating test. 

In the case we choose the observation-test $\{c_i\}_{i \in X}$ to 
be discriminating, the above Theorem yields a 
(simplified) version of Ozawa's Theorem in quantum theory [|5| . 
Here the simplification comes from the fact that we 
consider finite dimensional state spaces and tests with finite 
outcomes, whereas the challenging part of Ozawa's 
Theorem is the rigorous treatment of infinite dimension and 
continuous spectrum. 

Moreover, we can apply the dilation theorem to tests 
with trivial output B = I, thus obtaining the operational 
version of Naimark's Theorem |Q in the finite-outcome 
case: 

Corollary 24 (Discriminating dilation of 
observation-tests) For every observation-test $\{a_i\}_{i \in X}$ 
on A there exist a system C, a pure state $\varphi_0 \in \mathfrak{S}_1(C)$, 
a reversible channel $\mathscr{U} \in \mathfrak{T}(AC)$, and a discriminating 
test $\{c_i\}_{i \in X}$ on C such that 

$$(a_i|_A = (e|_A\,(c_i|_C\,\mathscr{U}\,|\varphi_0)_C \tag{155}$$

for all outcomes $i \in X$. 

Another corollary is the following: 

Corollary 25 (Characterization of theories with 
purification) In a theory with purification every test can 
be realized using only pure states, reversible transforma- 
tions, and discriminating tests. 

In fact, only one pure state for each system is enough, 
since due to Corollary ^ all pure states can be obtained 
from a fixed one by acting with reversible channels. 



3. in convex theory the map ^ i-> i?<^ defines a bijec- 
tive correspondence between transformations ^ € 
T(A,B) and bipartite states R G (5(BA) satisfying 
the property 



P)a = (e| 



AA- 



(157) 



Proof. Let us start from the proof of item 1. One di- 
rection is obvious: if {'^i}i£x is a test from A to B, it 
must satisfy the normalization condition X^iex (^Ib ~ 
(ej^ (see Eq. (^). The preparation-test {i?<^.}igx 
defined by \R'^i)-Qp^ = |^'"^'')aa satisfies the prop- 



BA 



E 



(ek^. |*^^^)aa 



erty Ei6X 

(e|^ |^'^^)aA' ^^^^ satisfies Eq. (156). Moreover, if 
two tests {'^i}i^x and {^/}igx satisfy R<^. = i?<^/ for all 
i S X, then by injectivity of the map 1— >■ R'^ (proved in 
Corollary ^l|), one has '^^i — for all i S X. Conversely, 
supp ose that {i?i}igx is a preparation-test satisfying Eq. 
(156). Then, by Lemma |25| there is a a system C, a pure 
state (/3o e 6i(BC), a reversible channel € T(ABC), 
and an observation-test {cijigx on C such that for every 
outcome i G X one has 



A 



(158) 



Defining the test {^€.{\.uzx by := (c^\^ (ej^^ \^q)bc, 
we then obtain 



X. STATES-TRANSFORMATIONS 
ISOMORPHISM 

The results of the previous Section allow a com- 
plete identification of transformations with bipartite 
states, thus providing the general version of the Choi- 
Jamiolkowski isomorphism [ p6| pTf in quantum theory. 
The correspondence is summarized in the following 

Theorem 17 (States-transformations isomor- 
phism) The storing map ^ R^ := |*^'^-')aA' 
where |^''^^)aa pure dynamically faithful state for 
system A, has the following properties: 

1. it defines a bijective correspondence between tests 
from A to B and preparation-tests {i?i}igx 
for BA satisfying 



BA 



(el 
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2. a transformation is atomic ( according to Defini- 
tion pi) if and only if the corresponding state R-^ 
is pure. 
This completes the proof of item 1. Item 2 is an immediate consequence of item 1: if $\mathscr{C}$ is atomic, then $R_{\mathscr{C}}$ must be pure, otherwise we would have a non-trivial decomposition of $\mathscr{C}$. Vice versa, if $R_{\mathscr{C}}$ is pure, then $\mathscr{C}$ must be atomic, otherwise we would have a non-trivial decomposition of $R_{\mathscr{C}}$. Regarding item 3, injectivity was already established in Corollary [?]. To prove surjectivity, suppose that $R \in \mathfrak{S}(B\tilde{A})$ is such that $(e|_B\, |R)_{B\tilde{A}}$ is in the refinement set of $|\omega)_{\tilde{A}} := (e|_A\, |\Psi^{(A)})_{A\tilde{A}}$. This means that there is a preparation-test $\{\omega_i\}_{i\in X}$ such that $\omega = \sum_{i\in X} \omega_i$ and $(e|_B\, |R)_{B\tilde{A}} = |\omega_{i_0})_{\tilde{A}}$ for some outcome $i_0$. Now choose an arbitrary set of normalized states $\{\rho_i\}_{i\in X} \subset \mathfrak{S}_1(B)$ and consider the collection of states $\{R_i\}_{i\in X}$ defined as follows: $R_{i_0} := R$, $R_i := \rho_i \otimes \omega_i$ for $i \neq i_0$. Because the theory is convex the collection of states $\{R_i\}_{i\in X}$ is a preparation-test (it can be obtained by randomization of the normalized states $\bar{R}_i = R_i / (e|R_i)_{B\tilde{A}}$ with probabilities $p_i = (e|R_i)_{B\tilde{A}}$). Moreover, it clearly satisfies Eq. (156). Therefore, using item 1 we see that there exists a test $\{\mathscr{C}_i\}_{i\in X}$ from A to B such that $R_i = R_{\mathscr{C}_i}$. In particular, $R = R_{i_0} = R_{\mathscr{C}_{i_0}}$, thus proving surjectivity. ■
Clearly, the correspondence $\mathscr{C} \mapsto R_{\mathscr{C}}$ can be extended via linear combinations to an injective linear map between the vector spaces $\mathfrak{T}_{\mathbb{R}}(A, B)$ and $\mathfrak{S}_{\mathbb{R}}(B\tilde{A})$.

An immediate consequence of the states- 
transformations isomorphism is the following 

Corollary 26 (Existence of an ultimate refinement) In a convex theory with purification, every test $\{\mathscr{C}_i\}_{i\in X}$ from A to B admits an ultimate refinement $\{\mathscr{D}_j\}_{j\in Y}$ where every transformation is atomic.

Proof. Consider the preparation-test $\{R_{\mathscr{C}_i}\}_{i\in X}$ and take the normalized states $\bar{R}_{\mathscr{C}_i} = R_{\mathscr{C}_i} / (e|R_{\mathscr{C}_i})_{B\tilde{A}}$. Since the states form a finite-dimensional compact convex set, each state $\bar{R}_{\mathscr{C}_i}$ has a convex decomposition on a finite number of pure states. Collecting together all these decompositions yields a preparation-test $\{R_j\}_{j\in Y}$, containing only pure states, that refines $\{R_{\mathscr{C}_i}\}_{i\in X}$. By the states-transformations isomorphism, one has $R_j = R_{\mathscr{D}_j}$, for a test $\{\mathscr{D}_j\}_{j\in Y}$ that refines $\{\mathscr{C}_i\}_{i\in X}$ and contains only atomic transformations. ■



A. First consequences of the isomorphism 

Two simple consequences of the states-transformations 
isomorphism are the following: 

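Corollary 27 A channel $\mathscr{V}$ from A to AB is atomic if and only if it is of the form

$\mathscr{V} = \mathscr{U}\, |\varphi_0)_B$ (160)

for some pure state $\varphi_0 \in \mathfrak{S}_1(B)$ and some reversible channel $\mathscr{U} \in \mathbf{G}_{AB}$.

Proof. Clearly a channel of the form $\mathscr{V} = \mathscr{U}\, |\varphi_0)_B$ is atomic, since the corresponding state $R_{\mathscr{V}} = \mathscr{U}\, |\Psi^{(A)})_{A\tilde{A}}\, |\varphi_0)_B$ is pure. Conversely, if $\mathscr{V}$ is atomic, then $R_{\mathscr{V}}$ is a purification of the state $|\omega)_{\tilde{A}} := (e|_A\, |\Psi^{(A)})_{A\tilde{A}}$. Since $R_{\mathscr{V}}$ and $\Psi^{(A)}$ are both purifications of the same state, by the uniqueness of purification stated by Lemma 21 we have $R_{\mathscr{V}} = \mathscr{U}\, |\Psi^{(A)})_{A\tilde{A}}\, |\varphi_0)_B$ for some pure state $\varphi_0 \in \mathfrak{S}_1(B)$ and some reversible channel $\mathscr{U} \in \mathbf{G}_{AB}$. Since $\Psi^{(A)}$ is dynamically faithful for system A, this implies $\mathscr{V} = \mathscr{U}\, |\varphi_0)_B$. ■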

When system B is trivial, we have the more specific 
result: 

Corollary 28 A channel from A to A is atomic if and only if it is reversible.

Proof. Special case of Corollary 27 with B = I. ■

The states-transformations isomorphism also allows 
one to prove that the sets of transformations, channels, 
reversible channels, and pure states are compact with 
respect to the operational norm induced by optimal dis- 
crimination: 



Corollary 29 The set of physical transformations $\mathfrak{T}(A, B)$ is compact in the operational norm.

Proof. By Theorem 17, we have $\dim(\mathfrak{T}_{\mathbb{R}}(A, B)) \le \dim(\mathfrak{S}_{\mathbb{R}}(B\tilde{A}))$, namely transformations span a finite-dimensional vector space. Since we are in finite dimensions, to prove compactness it is enough to prove that the set of transformations is closed. To see this, suppose that $\{\mathscr{C}_n\}$ is a Cauchy sequence of transformations. By definition, each transformation $\mathscr{C}_n$ arises in some test, which can be taken to be binary without loss of generality. Let $\{\mathscr{C}_n, \mathscr{D}_n\}$ be such a binary test, and let $\{R_{\mathscr{C}_n}, R_{\mathscr{D}_n}\}$ be the corresponding preparation-test. Since the set of all states $\mathfrak{S}(B\tilde{A})$ is compact (by hypothesis it is finite dimensional and closed), there is a subsequence $\{R_{\mathscr{C}_{n_k}}, R_{\mathscr{D}_{n_k}}\}$ converging to a binary preparation-test $\{R_0, R_1\}$. Now, since each test $\{R_{\mathscr{C}_{n_k}}, R_{\mathscr{D}_{n_k}}\}$ satisfies Eq. (156), also $\{R_0, R_1\}$ satisfies it. By the states-transformations isomorphism, this implies that there is a binary test $\{\mathscr{C}, \mathscr{D}\}$ such that $R_0 = R_{\mathscr{C}}$ and $R_1 = R_{\mathscr{D}}$. Finally, using the bound of Eq. (126) we see that $\mathscr{C}_{n_k}$ (and hence $\mathscr{C}_n$) converges to $\mathscr{C}$ in the operational norm. ■

Corollary 30 The set of channels from A to B is compact in the operational norm.

Proof. Again, since we are in finite dimension, it is enough to prove that the set of channels is closed. Let $\{\mathscr{C}_n\}$ be a Cauchy sequence of channels. Since the set of transformations is closed, the sequence converges to some transformation $\mathscr{C}$. Moreover, $\mathscr{C}$ is a channel. Indeed, since $\mathscr{C}_n$ is a channel we have $(e|_B\, \mathscr{C}_n = (e|_A$, and, for every state $\rho$, $(e|_B\, \mathscr{C}\, |\rho)_A = \lim_{n\to\infty} (e|_B\, \mathscr{C}_n\, |\rho)_A = (e|\rho)_A$, which implies $(e|_B\, \mathscr{C} = (e|_A$. ■



Corollary 31 The group $\mathbf{G}_A$ of all reversible transformations of system A is a compact Lie group.

Proof. Let $\{\mathscr{U}_n\}$ be a sequence of reversible channels converging to some channel $\mathscr{C}$. We now show that $\mathscr{C}$ is reversible. Indeed, consider the sequence $\{\mathscr{U}_n^{-1}\}$. Since the set of channels is compact, it is possible to choose a subsequence $\{\mathscr{U}_{n_k}^{-1}\}$ that converges to some channel $\mathscr{D}$. But now we have $\mathscr{C}\mathscr{D} = \lim_{k\to\infty} \mathscr{U}_{n_k}\mathscr{U}_{n_k}^{-1} = \mathscr{I}_A$, and $\mathscr{D}\mathscr{C} = \lim_{k\to\infty} \mathscr{U}_{n_k}^{-1}\mathscr{U}_{n_k} = \mathscr{I}_A$ [4], that is, $\mathscr{C}$ is reversible and $\mathscr{D} = \mathscr{C}^{-1}$. This proves that $\mathbf{G}_A$ is closed, and, therefore, compact. Finally, since $\mathbf{G}_A$ is compact and has a faithful finite-dimensional matrix representation, it is a Lie group (see e.g. Theorem 5.13 of Ref. [?]). ■



Corollary 32 The set of pure states of system A is compact.

Proof. Let $\{\varphi_n\}$ be a sequence of pure states converging to some state $\rho$. We now prove that $\rho$ is pure. To see this, let us fix a pure state $\varphi_0$. By Lemma 20 for every n there is a reversible channel $\mathscr{U}_n$ such that $\varphi_n = \mathscr{U}_n \varphi_0$. Since the group $\mathbf{G}_A$ is compact, we can take a subsequence $\{\mathscr{U}_{n_k}\}$ that converges to a reversible channel $\mathscr{U}$. Therefore we have $\rho = \lim_{n\to\infty} \varphi_n = \lim_{k\to\infty} \mathscr{U}_{n_k}\varphi_0 = \mathscr{U}\varphi_0$. Since $\rho$ is connected to a pure state by a reversible channel it must be pure. ■

We conclude this Subsection with two results that will be useful in the construction of deterministic teleportation:

Corollary 33 (Existence of a twirling test) In a (convex) theory with purification there always exists a twirling test $\{p_i \mathscr{U}_i\}_{i\in X}$ (according to Def. [?]), where $\{p_i\}$ are probabilities and $\{\mathscr{U}_i\}$ are reversible channels. In particular, one of the channels $\{\mathscr{U}_i\}$ can be always chosen to be the identity.

Proof. Let $d\mathscr{W}$ be the normalized Haar measure over the compact group $\mathbf{G}_A$, and define the channel $\mathscr{T} := \int d\mathscr{W}\, \mathscr{W}$, which is clearly a twirling channel, since by invariance of the Haar measure one has $\mathscr{U}\mathscr{T} = \mathscr{T}$ for every $\mathscr{U} \in \mathbf{G}_A$. Since the reversible channels span a finite-dimensional space, their convex hull is a finite-dimensional convex set. Then by Carathéodory's theorem the integral can be written as a finite convex combination of reversible transformations, i.e. $\mathscr{T} = \sum_{i\in X} p_i \mathscr{U}_i$. Since $\mathscr{U}_{i_0}^{-1}\mathscr{T} = \mathscr{T}$, we can pick an outcome $i_0$ and apply $\mathscr{U}_{i_0}^{-1}$, thus obtaining a new twirling test where one channel is the identity. ■
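As an illustration, consider the special case of a qubit in ordinary quantum theory (an assumption made here for concreteness; the corollary itself does not rely on Hilbert spaces). There the four Pauli channels with uniform probabilities give a finite randomization of reversible channels that contains the identity and reproduces the Haar average on states, since the Pauli group is a unitary 1-design. The following minimal sketch uses plain nested lists; all helper names are hypothetical.

```python
# Pauli twirl on a qubit: T(rho) = sum_i p_i U_i rho U_i^dagger with p_i = 1/4.
# Assumption: ordinary complex quantum theory; names are illustrative only.

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]  # the first element is the identity channel


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def conjugate_by(u, rho):
    """Reversible channel rho -> u rho u^dagger."""
    return matmul(matmul(u, rho), dagger(u))


def twirl(rho):
    """Uniform randomization over the four Pauli channels."""
    out = [[0j, 0j], [0j, 0j]]
    for u in PAULIS:
        term = conjugate_by(u, rho)
        for i in range(2):
            for j in range(2):
                out[i][j] += 0.25 * term[i][j]
    return out


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))


rho = [[0.8, 0.1 - 0.2j], [0.1 + 0.2j, 0.2]]  # an arbitrary qubit state
chi = [[0.5, 0], [0, 0.5]]                     # maximally mixed state
twirled = twirl(rho)
```

In this special case `twirled` coincides with `chi` for every input state, and `chi` is left unchanged by each Pauli channel, in line with the uniqueness of the invariant state discussed next.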

Corollary 34 (Uniqueness of the invariant state) For every system A, there is a unique state $\chi_A$ invariant under all reversible transformations in $\mathbf{G}_A$. Moreover, $\chi_A$ is internal.

Proof. Let $\mathscr{T}$ be the twirling channel defined in the previous Corollary. Since for two arbitrary pure states $\psi, \psi'$ there is a reversible channel $\mathscr{U}$ such that $\psi' = \mathscr{U}\psi$ (Lemma 20), this implies

$\mathscr{T}(\psi') = \int d\mathscr{W}\, \mathscr{W}\mathscr{U}\psi = \int d\mathscr{W}\, \mathscr{W}\psi = \mathscr{T}(\psi) := \chi$, (161)

having used the invariance of the Haar measure. Now, since the twirling channel is constant on pure states, it is constant on every state, namely $\mathscr{T}(\rho) = \chi$ for every $\rho$. In particular, if $\rho$ is an invariant state, then we have $\rho = \mathscr{T}(\rho) = \chi$. This proves that the invariant state is unique. Finally, Corollary 33 implies that the integral $\mathscr{T}(\rho)$ can be written as the sum of the transformations of a twirling test containing the identity, namely

$\chi = \sum_{i\in X} p_i\, \mathscr{U}_i\, \rho$, (162)

whence, taking $i_0$ to be the outcome with $\mathscr{U}_{i_0} = \mathscr{I}_A$, $p_{i_0}\rho$ belongs to the refinement set of $\chi$ for every state $\rho$. This proves that $\chi$ is internal. ■

B. Entanglement breaking channels

An interesting consequence of the states-transformations isomorphism regards the identification of measure-and-prepare channels and entanglement breaking channels, the latter defined as follows

Definition 48 (Entanglement-breaking channel) A channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is entanglement breaking if the output state $\mathscr{C}\, |\sigma)_{AC}$ is separable for every state $\sigma \in \mathfrak{S}_1(AC)$, namely

$\mathscr{C}\, |\sigma)_{AC} = \sum_{i\in X} p_i\, |\beta_i)_B\, |\rho_i)_C$, (163)

for some separable preparation-test $\{p_i\, \beta_i \otimes \rho_i\}_{i\in X}$, $\beta_i \in \mathfrak{S}_1(B)$, $\rho_i \in \mathfrak{S}_1(C)$.

The following Theorem extends to arbitrary theories with purification the characterization of entanglement breaking channels presented in quantum theory by Horodecki, Shor, and Ruskai in Ref. [?]:

Corollary 35 (Structure of entanglement-breaking channels) In a theory with purification, the following are equivalent

1. $\mathscr{C}$ is entanglement-breaking

2. $R_{\mathscr{C}}$ is separable

3. $\mathscr{C}$ is measure-and-prepare

Proof. (1) ⇒ (2) If $\mathscr{C}$ is entanglement-breaking, then in particular $|R_{\mathscr{C}})_{B\tilde{A}} = \mathscr{C}\, |\Psi^{(A)})_{A\tilde{A}}$ is separable. (2) ⇒ (3) Suppose that $R_{\mathscr{C}}$ is separable, namely $R_{\mathscr{C}} = \sum_{i\in X} p_i\, \beta_i \otimes \rho_i$ for some separable preparation-test $\{p_i\, \beta_i \otimes \rho_i\}_{i\in X}$ (with $\beta_i \in \mathfrak{S}_1(B)$ and $\rho_i \in \mathfrak{S}_1(\tilde{A})$). Now, the preparation-test $\{p_i \rho_i\}_{i\in X}$ has the property

$\sum_{i\in X} p_i \rho_i = (e|_B\, |R_{\mathscr{C}})_{B\tilde{A}} = (e|_A\, |\Psi^{(A)})_{A\tilde{A}} := |\chi)_{\tilde{A}}$, (164)

having used that $|R_{\mathscr{C}})_{B\tilde{A}} = \mathscr{C}\, |\Psi^{(A)})_{A\tilde{A}}$ and the fact that $\mathscr{C}$ is a channel. Applying the first item of Theorem 17 with B = I, we then deduce that $p_i \rho_i = R_{a_i}$ for some suitable observation-test $\{a_i\}_{i\in X}$ on A. Considering the measure-and-prepare channel $\mathscr{C}' := \sum_{i\in X} |\beta_i)_B\, (a_i|_A$ we then obtain $R_{\mathscr{C}'} = R_{\mathscr{C}}$, which implies $\mathscr{C} = \mathscr{C}'$. Hence, $\mathscr{C}$ is measure-and-prepare. (3) ⇒ (1) If $\mathscr{C}$ is measure-and-prepare, it is easily seen that it is entanglement-breaking. ■
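The step from item 3 to item 2 can be checked by hand in the qubit case of quantum theory, taken here as an assumed concrete instance of the general framework. The sketch below stores a measure-and-prepare channel on the maximally entangled state, which plays the role of the faithful state, and compares the result with the explicitly separable form $\sum_i p_i\, \beta_i \otimes \rho_i$. The prepared states `BETA` and all function names are hypothetical choices.

```python
# Choi state of a qubit measure-and-prepare channel (assumed quantum example).
# Two-qubit basis index is 2 * output + reference.

BETA = [
    [[1.0, 0.0], [0.0, 0.0]],  # beta_0 = |0><0|
    [[0.5, 0.5], [0.5, 0.5]],  # beta_1 = |+><+|
]


def channel(m):
    """Measure in the computational basis, then prepare beta_i.

    Acts linearly on any 2x2 matrix m: m -> sum_i m[i][i] * beta_i.
    """
    return [[sum(m[i][i] * BETA[i][r][c] for i in range(2))
             for c in range(2)] for r in range(2)]


def choi_state(chan):
    """(chan tensor identity) applied to |Phi+><Phi+|."""
    out = [[0.0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            # |Phi+><Phi+| = 1/2 sum_{a,b} |a><b| (x) |a><b|
            unit = [[1.0 if (i, j) == (a, b) else 0.0 for j in range(2)]
                    for i in range(2)]
            img = chan(unit)
            for r in range(2):
                for c in range(2):
                    out[2 * r + a][2 * c + b] += 0.5 * img[r][c]
    return out


def separable_form():
    """sum_i (1/2) beta_i (x) |i><i|, a manifestly separable state."""
    out = [[0.0] * 4 for _ in range(4)]
    for i in range(2):
        for r in range(2):
            for c in range(2):
                out[2 * r + i][2 * c + i] += 0.5 * BETA[i][r][c]
    return out


R = choi_state(channel)
S = separable_form()
max_diff = max(abs(R[i][j] - S[i][j]) for i in range(4) for j in range(4))
trace_R = sum(R[i][i] for i in range(4))
```

Here `max_diff` vanishes up to rounding, so the stored state is a mixture of product states with weights $p_i = 1/2$, as in Eq. (163).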



C. Completeness of theories with purification 

As a consequence of the states-transformations isomor- 
phism, in a theory with purification we cannot enlarge the 
set of transformations without enlarging the set of states. 
Indeed, we can compare different theories that have the 
same set of systems in the following way: 
Definition 49 (Inclusion of theories) The theory $\Theta'$ is larger than the theory $\Theta$ if for every couple of systems A, B one has $\mathfrak{T}(A, B) \subseteq \mathfrak{T}'(A, B)$, where $\mathfrak{T}'(A, B)$ denotes the set of all transformations from A to B allowed by $\Theta'$.

Then we have the following

Lemma 26 (Maximality of theories with purification) Let $\Theta$ be a convex theory with purification, and $\Theta'$ be a convex theory with the same sets of normalized states of $\Theta$, i.e. $\mathfrak{S}_1(A) = \mathfrak{S}'_1(A)$ for every A. If $\Theta'$ is larger than $\Theta$, then $\Theta' = \Theta$.

Proof. First of all, note that the deterministic effect, uniquely defined by the condition $(e|\rho)_A = 1$, $\forall \rho \in \mathfrak{S}_1(A)$, is the same in both theories. Now suppose that $\{\mathscr{C}'_i\}_{i\in X}$ is one of the tests from A to B allowed by theory $\Theta'$. Let $\{R_{\mathscr{C}'_i}\}_{i\in X}$ be the corresponding preparation-test for system $B\tilde{A}$, as defined by the states-transformations isomorphism of Theorem 17. Since the theories $\Theta'$ and $\Theta$ have the same states, each $R_{\mathscr{C}'_i}$ is also a state in $\Theta$. Now, convexity of the set of states implies that $\{R_{\mathscr{C}'_i}\}_{i\in X}$ is a legitimate preparation-test in $\Theta$. Moreover, we have $\sum_{i\in X} (e|_B\, |R_{\mathscr{C}'_i})_{B\tilde{A}} = (e|_A\, |\Psi^{(A)})_{A\tilde{A}} := |\omega)_{\tilde{A}}$. Then, by Theorem 17 there must be a test $\{\mathscr{C}_i\}_{i\in X}$ from A to B, allowed by theory $\Theta$, such that $R_{\mathscr{C}'_i} = R_{\mathscr{C}_i} := \mathscr{C}_i\, |\Psi^{(A)})_{A\tilde{A}}$. Since $|\Psi^{(A)})_{A\tilde{A}}$ is dynamically faithful for system A, this implies $\mathscr{C}'_i = \mathscr{C}_i$ for every $i \in X$. Therefore, $\Theta'$ and $\Theta$ have exactly the same tests. ■

The states-transformations isomorphism has also the 
very strong consequence that any transformation that is 
"mathematically admissible" can be actually realized as 
a test. To make this statement precise, let us give the 
following definitions: 

Definition 50 (Positive transformation) A transformation $\mathscr{C} \in \mathfrak{T}_{\mathbb{R}}(A, B)$ is positive if for every $\rho \in \mathfrak{S}_+(A)$ one has $\mathscr{C}\, |\rho)_A \in \mathfrak{S}_+(B)$.

Definition 51 (S-positive transformation) Given a system S, a transformation $\mathscr{C} \in \mathfrak{T}_{\mathbb{R}}(A, B)$ is S-positive if $\mathscr{C} \otimes \mathscr{I}_S$ is positive.

Definition 52 (Completely positive transformation) A transformation $\mathscr{C} \in \mathfrak{T}_{\mathbb{R}}(A, B)$ is completely positive (CP) if it is S-positive for every system S.
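The gap between positivity and complete positivity can be seen in the familiar quantum example of transposition on a qubit, used here as an assumed illustration rather than as part of the general framework. Transposition maps states to states, but its extension to a two-qubit entangled state yields an operator with a negative expectation value. The sketch below checks both facts; all names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)


def transpose_map(m):
    """Transposition of a 2x2 matrix: a positive map on qubit states."""
    return [[m[j][i] for j in range(2)] for i in range(2)]


def apply_on_first(chan):
    """(chan tensor identity) applied to |Phi+><Phi+|.

    Two-qubit basis index is 2 * first + second.
    """
    out = [[0.0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            # |Phi+><Phi+| = 1/2 sum_{a,b} |a><b| (x) |a><b|
            unit = [[1.0 if (i, j) == (a, b) else 0.0 for j in range(2)]
                    for i in range(2)]
            img = chan(unit)
            for r in range(2):
                for c in range(2):
                    out[2 * r + a][2 * c + b] += 0.5 * img[r][c]
    return out


def expectation(vec, op):
    """<vec| op |vec> for a real vector."""
    n = len(vec)
    return sum(vec[i] * op[i][j] * vec[j] for i in range(n) for j in range(n))


# The extended map produces a negative expectation value on the singlet.
singlet = [0.0, S, -S, 0.0]  # (|01> - |10>)/sqrt(2)
witness = expectation(singlet, apply_on_first(transpose_map))

# On a single qubit the transposed state still has unit trace and
# non-negative determinant, hence it is a valid state.
rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]
rho_t = transpose_map(rho)
det_rho_t = (rho_t[0][0] * rho_t[1][1] - rho_t[0][1] * rho_t[1][0]).real
```

The negative value of `witness` shows that transposition is not S-positive when S is a qubit, so it is positive without being CP.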

Definition 53 (Admissible instrument) An admissible instrument with input A and output B is a collection of CP transformations $\{\mathscr{C}_i\}_{i\in X}$ such that

$\sum_{i\in X} (e|_B\, \mathscr{C}_i = (e|_A$. (165)



The following Theorem establishes that every admissible instrument must be feasible in a convex theory with purification:

Theorem 18 (Completeness of theories with purification) In a convex theory with purification every admissible instrument from A to B is a test. In particular, every admissible instrument from A to I is an observation-test.

Proof. Call $\Theta$ the theory under consideration, and consider the set of all admissible instruments that are conceivable in $\Theta$. This set is closed under parallel/sequential composition and under coarse-graining and conditioning. Therefore this set defines a new theory $\Theta'$ that is larger than $\Theta$. Moreover, by construction $\Theta'$ and $\Theta$ have the same states. By Lemma 26, this implies $\Theta' = \Theta$. ■

Corollary 36 (Characterization of physical transformations) In a convex theory with purification the following are equivalent

1. $\mathscr{C}$ is a physical transformation from A to B

2. $\mathscr{C}$ is a CP transformation from A to B and $(e|_A - (e|_B\, \mathscr{C}$ is CP.

Proof. The direction 1 ⇒ 2 is obvious. Conversely, suppose that condition 2 is satisfied, and define the CP transformations $(a|_A := (e|_A - (e|_B\, \mathscr{C}$ and $\mathscr{D} := |\beta)_B\, (a|_A$, where $|\beta)_B$ is some normalized state of system B. Then the collection of CP transformations $\{\mathscr{C}, \mathscr{D}\}$ is an admissible instrument. By the completeness of Theorem 18 this implies that $\{\mathscr{C}, \mathscr{D}\}$ is a test allowed by the theory. Hence, $\mathscr{C}$ is a physical transformation. ■

We are now in position to prove a stronger result than Lemma 26, namely the fact that a theory with purification is completely specified once we have declared the states for every system:

Theorem 19 (States specify the theory) Let $\Theta$, $\Theta'$ be two convex theories with purification. If $\Theta$ and $\Theta'$ have the same sets of normalized states, then $\Theta' = \Theta$.

Proof. Given two theories $\Theta$, $\Theta'$ with the same set of states we can take the new theory $\Theta \cup \Theta'$ that is generated by $\Theta$ and $\Theta'$ by taking sequential and parallel composition of the corresponding CP transformations. Since by construction $\Theta \cup \Theta'$ contains $\Theta$ and $\Theta'$ and has the same sets of states, by Lemma 26 we have $\Theta = \Theta \cup \Theta' = \Theta'$. ■



We conclude this Subsection by discussing the implication of the no-restriction hypothesis of Def. [?] and of Lemma [?], which states that every element in the dual cone of states is proportional to a possible effect. In this case, we have the following characterization:

Lemma 27 In a theory satisfying the no-restriction hypothesis of Def. [?] the following are equivalent:

1. $\mathscr{A} \in \mathfrak{T}_{\mathbb{R}}(A, I)$ is CP

2. $a$ is an element of the dual cone $\mathfrak{S}_+(A)^*$

3. $a$ is an element of the cone $\mathfrak{E}_+(A)$

Proof. 1 ⇒ 2. Any CP transformation $\mathscr{A}$ from A to I defines a unique element $a$ of the dual cone $\mathfrak{S}_+(A)^*$, via the relation $a(\rho) := \mathscr{A}\, |\rho)_A$. In fact, $\mathscr{A}$ and $a$ are identified: if two CP transformations $\mathscr{A}$ and $\mathscr{A}'$ define the same effect, then we also have $(\mathscr{A} \otimes \mathscr{I}_C)\, |\rho)_{AC} = (\mathscr{A}' \otimes \mathscr{I}_C)\, |\rho)_{AC}$ for every system C and for every state $\rho \in \mathfrak{S}(AC)$. Therefore $\mathscr{A} = \mathscr{A}'$ and we can identify $\mathscr{A}$ with $a$. 2 ⇒ 3. By the consequence of the no-restriction hypothesis stated by Lemma [?], if $a$ is in the dual $\mathfrak{S}_+(A)^*$ then $a$ is in $\mathfrak{E}_+(A)$. 3 ⇒ 1. By definition, an element of $\mathfrak{E}_+(A)$ is proportional to an effect (with a positive proportionality constant). Now every effect is a physical transformation from A to I, and physical transformations are by definition CP. ■

Definition 54 (Effect-valued measures) An admissible instrument from A to I is an effect-valued measure (EVM), that is, a collection of effects $\{a_i\}_{i\in X}$ such that $\sum_{i\in X} (a_i|_A = (e|_A$.

The completeness Theorem 18 now implies

Corollary 37 (Characterization of observation-tests) In a convex theory with purification every effect-valued measure is an observation-test. If the no-restriction hypothesis of Def. [?] holds, every probability rule (collection of positive functionals that sum to the deterministic effect) is an observation-test.

Finally, the characterization of Corollary 36 becomes:

Corollary 38 In a convex theory with purification satisfying the no-restriction hypothesis of Def. [?] the following are equivalent

1. $\mathscr{C}$ is a physical transformation from A to B

2. $\mathscr{C}$ is a CP transformation from A to B and is normalization non-increasing, i.e., $(e|_B\, \mathscr{C}\, |\rho)_A \le (e|\rho)_A$ for every $\rho \in \mathfrak{S}(A)$.

XI. ERROR CORRECTION

A. Basic definitions

Here we give some basic definitions that will be used in the next Subsections.

Definition 55 (Correctable channels) A channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is correctable upon input of $\rho \in \mathfrak{S}_1(A)$ if there is a recovery channel $\mathscr{R} \in \mathfrak{T}(B, A)$ such that $\mathscr{R}\mathscr{C} =_\rho \mathscr{I}_A$. If $\rho$ is an internal state, we simply say that $\mathscr{C}$ is correctable.

Definition 56 (Deletion channels) A channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is a deletion channel upon input of $\rho \in \mathfrak{S}_1(A)$ if there is a fixed state $\sigma \in \mathfrak{S}_1(B)$ such that $\mathscr{C} =_\rho |\sigma)_B\, (e|_A$.

Definition 57 (Purification-preserving channels) A channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is purification-preserving for $\rho \in \mathfrak{S}_1(A)$ if there is a recovery channel $\mathscr{R} \in \mathfrak{T}(B, A)$ such that $\mathscr{R}\mathscr{C}\, |\Psi_\rho)_{AR} = |\Psi_\rho)_{AR}$, where $\Psi_\rho \in \mathfrak{S}_1(AR)$ is an arbitrary purification of $\rho$.

In the context of error correction, the purifying system R will be referred to as the reference.

Definition 58 (Correlation-erasing channels) A channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is correlation-erasing for $\rho \in \mathfrak{S}_1(A)$ if there is a state $\sigma \in \mathfrak{S}_1(B)$ such that $\mathscr{C}\, |\Psi_\rho)_{AR} = |\sigma)_B\, |\tilde{\rho})_R$, where $\Psi_\rho \in \mathfrak{S}_1(AR)$ is an arbitrary purification of $\rho$, and $\tilde{\rho}$ is the complementary state $|\tilde{\rho})_R := (e|_A\, |\Psi_\rho)_{AR}$.

In a theory with purification, the interplay between these four definitions is the basic underlying structure of error correction. The simplest relations can be immediately recovered from Theorem [?], which related the equality upon input of $\rho$ to the equality on a purification of $\rho$.

Corollary 39 A channel is correctable upon input of $\rho$ if and only if it is purification-preserving for $\rho$.

Corollary 40 If a channel is correlation-erasing for $\rho$, then it is a deletion channel upon input of $\rho$. If local discriminability holds, the converse is also true.

Another simple fact about error correction, which holds in all theories with purification, is the following

Lemma 28 If a channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is correctable upon input of $\rho \in \mathfrak{S}_1(A)$ with recovery channel $\mathscr{R}$, and $\mathscr{D} \in D_{\mathscr{C}}$ is a transformation in the refinement set of $\mathscr{C}$ (Def. [?]), then $\mathscr{D}$ is correctable upon input of $\rho$, with recovery channel $\mathscr{R}$, i.e. $\mathscr{R}\mathscr{D} =_\rho p\, \mathscr{I}_A$ for some probability $p > 0$.

Proof. By definition, since $\mathscr{D}$ is in the refinement set of $\mathscr{C}$ there is a test $\{\mathscr{D}_i\}_{i\in X}$ such that $\mathscr{D}_{i_0} = \mathscr{D}$ and $\mathscr{C} = \sum_{i\in X} \mathscr{D}_i$. Since $\mathscr{C}$ is correctable with recovery channel $\mathscr{R}$, one has $\mathscr{I}_A =_\rho \mathscr{R}\mathscr{C} = \sum_{i\in X} \mathscr{R}\mathscr{D}_i$. This means that the test $\{\mathscr{R}\mathscr{D}_i\}_{i\in X}$ is non-disturbing upon input of $\rho$. By the "no-information without disturbance" Theorem [?] one then has $\mathscr{R}\mathscr{D}_i =_\rho p_i\, \mathscr{I}_A$ for every $i \in X$. ■


B. Error correction and the complementarity 
between correctable and deletion channels 

We now discuss some necessary and sufficient condi- 
tions for the correctability of channels. The simplest case 
is that of channels from a system to itself: 

Theorem 20 A channel from A to A is correctable if 
and only if it is reversible. 
Proof. Clearly, if $\mathscr{C} = \mathscr{U} \in \mathbf{G}_A$ one can correct $\mathscr{C}$ by applying $\mathscr{U}^{-1}$. Conversely, suppose that $\mathscr{C}$ is correctable with some recovery channel $\mathscr{R}$. Let $\mathscr{C} = \sum_{i\in X} \mathscr{C}_i$ be a refinement of $\mathscr{C}$ where each $\mathscr{C}_i$ is an atomic transformation. Then, the composition $\{\mathscr{R}\mathscr{C}_i\}_{i\in X}$ is a non-disturbing test, and Theorem [?] implies $\mathscr{R}\mathscr{C}_i = p_i\, \mathscr{I}_A$. Since $\mathscr{R}$ is a channel, applying the deterministic effect we obtain $(e|_A\, \mathscr{C}_i = (e|_A\, \mathscr{R}\mathscr{C}_i = p_i\, (e|_A$, that is, $\mathscr{C}_i$ is proportional to an atomic channel $\mathscr{U}_i$. By Corollary 28, an atomic channel from A to A is reversible. Therefore, we have $\mathscr{R}\mathscr{U}_i = \mathscr{I}_A$, which implies $\mathscr{R} = \mathscr{U}_i^{-1}$ for every i. Hence, all channels $\mathscr{U}_i$ must be equal, and one has $\mathscr{C} = \mathscr{U}$ for some reversible channel $\mathscr{U} \in \mathbf{G}_A$. ■

We now give necessary and sufficient conditions for error correction in the general case of channels from A to B. The following condition was presented in the quantum case in Refs. [54-56].



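Theorem 21 (Factorization of reference and environment) A channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is correctable upon input of $\rho$ if and only if there are a reversible dilation $\mathscr{V} \in \mathfrak{T}(A, BE)$ of $\mathscr{C}$ and a purification $|\Psi_\rho)_{AR}$ of $\rho$ such that systems E and R remain uncorrelated, namely

$(e|_B\, \mathscr{V}\, |\Psi_\rho)_{AR} = |\sigma)_E\, |\tilde{\rho})_R$, (166)

where $\sigma$ is some state of E and $\tilde{\rho}$ is the complementary state of $\rho$ on system R.

Proof. Suppose that $\mathscr{C}$ is correctable upon input of $\rho$ with some recovery channel $\mathscr{R}$. Then, by Theorem [?] we have

$\mathscr{R}\mathscr{C}\, |\Psi_\rho)_{AR} = |\Psi_\rho)_{AR}$, (167)

and, inserting two reversible dilations $\mathscr{V} \in \mathfrak{T}(A, BE)$ and $\mathscr{W} \in \mathfrak{T}(B, AF)$ for $\mathscr{C}$ and $\mathscr{R}$, respectively,

$(e|_E\, (e|_F\, \mathscr{W}\mathscr{V}\, |\Psi_\rho)_{AR} = |\Psi_\rho)_{AR}$. (168)

This means that $\mathscr{W}\mathscr{V}\, |\Psi_\rho)_{AR}$ is a purification of $|\Psi_\rho)_{AR}$. Then, Lemma [?] ensures that $\mathscr{W}\mathscr{V}\, |\Psi_\rho)_{AR}$ is of the form

$\mathscr{W}\mathscr{V}\, |\Psi_\rho)_{AR} = |\Psi_\rho)_{AR}\, |\eta)_{EF}$, (169)

where $\eta$ is some pure state on EF. Applying the deterministic effect on FA and using the fact that $\mathscr{W}$ is a channel, we then obtain Eq. (166). Conversely, suppose that Eq. (166) holds for some dilation $\mathscr{V}$ and some purification $|\Psi_\rho)_{AR}$. Then take a purification of $\sigma$, say $\Psi_\sigma \in \mathfrak{S}_1(EF)$. Since $\mathscr{V}\, |\Psi_\rho)_{AR}$ and $|\Psi_\rho)_{AR}\, |\Psi_\sigma)_{EF}$ are both purifications of $|\sigma)_E\, |\tilde{\rho})_R$, by Lemma [?] we have

$\mathscr{D}\mathscr{V}\, |\Psi_\rho)_{AR} = |\Psi_\rho)_{AR}\, |\Psi_\sigma)_{EF}$ (170)

for some channel $\mathscr{D} \in \mathfrak{T}(B, FA)$. Applying the deterministic effect on E and F and defining $\mathscr{R} := (e|_F\, \mathscr{D}$ we then obtain

$\mathscr{R}\mathscr{C}\, |\Psi_\rho)_{AR} = |\Psi_\rho)_{AR}$. (171)

By Theorem 7 this implies $\mathscr{R} \circ \mathscr{C} =_\rho \mathscr{I}_A$, namely $\mathscr{C}$ is correctable upon input of $\rho$. ■

An immediate consequence of the factorization Theorem 21 is:

Corollary 41 (Complementarity of purification-preserving and correlation-erasing channels) A channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is purification-preserving for $\rho \in \mathfrak{S}_1(A)$ (according to Def. 57) if and only if its complementary channel $\tilde{\mathscr{C}} \in \mathfrak{T}(A, E)$ is correlation-erasing for $\rho$ (according to Def. 58).

Proof. By Corollary 39, $\mathscr{C}$ is purification-preserving for $\rho$ iff it is correctable upon input of $\rho$ and, by the previous Theorem, iff Eq. (166) holds. But Eq. (166) is the definition of $\tilde{\mathscr{C}}$ being a correlation-erasing channel for $\rho$. ■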

In a theory with purification, since the global evolu- 
tion of system and environment is reversible, it would be 
natural to expect that if no information goes to the en- 
vironment, then the whole information about the input 
state is contained in the system. While this intuition is 
correct in theories with local discriminability (see Ref. 
p7[ for the quantum case) , in general theories this situa- 
tion is trickier. Indeed, as we will see in the following, in 
a theory without local discriminability some information 
can remain "locked" in the global state, in a way that 
makes it inaccessible both from the system and from the 
environment separately. 

Corollary 42 (Complementarity of correctable and deletion channels) If a channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is correctable upon input of $\rho \in \mathfrak{S}_1(A)$ (according to Definition 55), then its complementary channel $\tilde{\mathscr{C}} \in \mathfrak{T}(A, E)$ is a deletion channel upon input of $\rho$ (according to Definition 56). If local discriminability holds, the converse is also true.

Proof. Direct consequence of Corollaries 39, 40, and 41. ■

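Counterexample. We show that in a theory without local discriminability the complementarity between correctable and deletion channels does not hold. Consider the case of quantum mechanics on real Hilbert spaces, and consider the isometry V from a real qubit to two real qubits defined by

$V = |\Phi_+\rangle\langle 0| + |\Psi_-\rangle\langle 1|$, (172)

with $|\Phi_+\rangle := (|0\rangle|0\rangle + |1\rangle|1\rangle)/\sqrt{2}$ and $|\Psi_-\rangle := (|0\rangle|1\rangle - |1\rangle|0\rangle)/\sqrt{2}$. In this case the complementary channels $\mathscr{C}(\rho) := \mathrm{Tr}_1[V \rho V^\dagger]$ and $\tilde{\mathscr{C}}(\rho) := \mathrm{Tr}_2[V \rho V^\dagger]$ are both deletion channels: indeed, one has

$\mathscr{C}(\rho) = \tilde{\mathscr{C}}(\rho) = \frac{I}{2}$ (173)

for any real density matrix $\rho$.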
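Eq. (173) can be verified numerically. The sketch below, written with plain nested lists and hypothetical helper names, builds the isometry of Eq. (172), applies it to a real density matrix, and computes both marginals. For contrast it also feeds in a complex density matrix, which is not a state of real quantum theory, to show where the "locked" information would sit.

```python
import math

S = 1 / math.sqrt(2)

# Columns of V in the basis |00>, |01>, |10>, |11>:
# V|0> = |Phi+> = (|00> + |11>)/sqrt(2), V|1> = |Psi-> = (|01> - |10>)/sqrt(2).
V = [[S, 0.0],
     [0.0, S],
     [0.0, -S],
     [S, 0.0]]


def matmul(a, b):
    """Matrix product for nested lists of compatible shapes."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def output_state(rho):
    """V rho V^T (V is real, so the adjoint is the transpose)."""
    return matmul(matmul(V, rho), transpose(V))


def trace_out_first(sigma):
    """Tr_1: marginal on the second qubit (index = 2 * first + second)."""
    return [[sum(sigma[2 * t + i][2 * t + j] for t in range(2))
             for j in range(2)] for i in range(2)]


def trace_out_second(sigma):
    """Tr_2: marginal on the first qubit."""
    return [[sum(sigma[2 * i + t][2 * j + t] for t in range(2))
             for j in range(2)] for i in range(2)]


def distance_from_half_identity(m):
    half = [[0.5, 0.0], [0.0, 0.5]]
    return max(abs(m[i][j] - half[i][j]) for i in range(2) for j in range(2))


rho_real = [[0.7, 0.3], [0.3, 0.3]]          # real symmetric, trace one
sigma_real = output_state(rho_real)
dev_1 = distance_from_half_identity(trace_out_first(sigma_real))
dev_2 = distance_from_half_identity(trace_out_second(sigma_real))

rho_complex = [[0.5, -0.5j], [0.5j, 0.5]]    # not a state of the real theory
sigma_complex = output_state(rho_complex)
dev_complex = distance_from_half_identity(trace_out_first(sigma_complex))
```

For the real input both deviations vanish up to rounding, while the complex input leaves an imaginary off-diagonal term in the marginal. Only the antisymmetric part of the input would reach either marginal, and real density matrices have none.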



Error correction with one-way classical 
communication from the environment 



Here we briefly discuss a more general kind of correction, in which the environment is not completely inaccessible, but rather some operations on it are allowed. Particularly interesting is the case of LOCC operations, which do not require the exchange of systems from the environment, but only communication of outcomes and conditioned operations. In particular, we will focus here on the case of a single round of forward classical communication from the environment to the output system. With the term "classical" we mean that only outcomes are communicated.

Definition 59 (One-way correctable channels) A channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is one-way correctable upon input of $\rho$ if for every dilation $\mathscr{V} \in \mathfrak{T}(A, BE)$ there is an observation-test $\{a_i\}_{i\in X}$ on E and a collection of recovery channels $\{\mathscr{R}_i\}_{i\in X} \subset \mathfrak{T}(B, A)$ such that

$\sum_{i\in X} \mathscr{R}_i\, (a_i|_E\, \mathscr{V} =_\rho \mathscr{I}_A$.

If $\rho$ is an internal state, we simply say that $\mathscr{C}$ is one-way correctable.

The following theorem states that one-way correctable channels are nothing but randomizations of correctable channels. The quantum version of it was given by Gregoratti and Werner in Ref. [?].

Theorem 22 (Characterization of one-way correctable channels) A channel $\mathscr{C} \in \mathfrak{T}(A, B)$ is one-way correctable upon input of $\rho \in \mathfrak{S}_1(A)$ if and only if $\mathscr{C}$ is the coarse-graining of a test $\{\mathscr{C}_i\}_{i\in X}$ where each transformation $\mathscr{C}_i$ is correctable upon input of $\rho$. In particular, if $\rho$ is internal, then $\mathscr{C}$ is a randomization of correctable channels.



Proof. Suppose that $\mathscr{C}$ is one-way correctable upon input of $\rho$. Defining the test $\{\mathscr{C}_i\}_{i\in X}$ by $\mathscr{C}_i := (a_i|_E\, \mathscr{V}$, and using Theorem [?] we then obtain $\mathscr{R}_i\mathscr{C}_i =_\rho p_i\, \mathscr{I}_A$. Therefore, $\mathscr{C}$ is the coarse-graining of a test where each transformation is correctable upon input of $\rho$. Moreover, if $\rho$ is internal, using the fact that each $\mathscr{R}_i$ is a channel, we obtain

$(e|_B\, \mathscr{C}_i = p_i\, (e|_A$,

namely each $\mathscr{C}_i$ must be proportional to a channel, say $\mathscr{C}_i = p_i \mathscr{D}_i$, with channel $\mathscr{D}_i$ correctable upon input of $\rho$. Conversely, suppose that $\mathscr{C} = \sum_{i\in X} \mathscr{C}_i$ for some test $\{\mathscr{C}_i\}_{i\in X}$ where each transformation is correctable upon input of $\rho$. Dilating such a test, we then obtain a channel $\mathscr{V} \in \mathfrak{T}(A, BE)$ and an observation-test $\{c_i\}_{i\in X}$ on E such that

$\mathscr{C}_i = (c_i|_E\, \mathscr{V}$

for every outcome $i \in X$. Since each $\mathscr{C}_i$ is correctable upon input of $\rho$, knowing the outcome $i \in X$, we can perform the recovery channel for $\mathscr{C}_i$, thus correcting channel $\mathscr{C}$. ■

In the case of channels from A to itself, the above the- 
orem takes the simple form 

Corollary 43 A channel $\mathscr{C} \in \mathfrak{T}(A)$ is one-way correctable if and only if it is a randomization of reversible channels.

Proof. Just combine Theorem 22 with the characterization of correctable channels from A to A (Theorem 20). ■
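A minimal sketch of this statement, assuming a qubit in quantum theory and a randomization of three real orthogonal matrices (a hypothetical choice of channel and probabilities): each branch $p_i \mathscr{U}_i$ is undone by $\mathscr{U}_i^{-1}$ once the outcome i has been communicated from the environment, whereas the coarse-grained channel alone changes the state.

```python
# One-way correction of a randomization of reversible channels
# (assumed quantum example; names and probabilities are illustrative).

I2 = [[1.0, 0.0], [0.0, 1.0]]
X = [[0.0, 1.0], [1.0, 0.0]]
Z = [[1.0, 0.0], [0.0, -1.0]]
UNITARIES = [I2, X, Z]   # real orthogonal matrices, inverse = transpose
PROBS = [0.5, 0.3, 0.2]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def branch(i, rho):
    """Unnormalized output of outcome i: p_i U_i rho U_i^T."""
    u = UNITARIES[i]
    out = matmul(matmul(u, rho), transpose(u))
    return [[PROBS[i] * out[r][c] for c in range(2)] for r in range(2)]


def recover(i, sigma):
    """Recovery channel for outcome i: U_i^T sigma U_i."""
    u = UNITARIES[i]
    return matmul(matmul(transpose(u), sigma), u)


def channel(rho):
    """Coarse-graining over the outcomes (outcome not known)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(len(UNITARIES)):
        total = add(total, branch(i, rho))
    return total


def one_way_corrected(rho):
    """Apply the recovery channel conditioned on the communicated outcome."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(len(UNITARIES)):
        total = add(total, recover(i, branch(i, rho)))
    return total


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))


rho = [[0.7, 0.3], [0.3, 0.3]]
noisy = channel(rho)
restored = one_way_corrected(rho)
```

With the outcome available, `restored` equals `rho`, while `noisy` does not.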



XII. CAUSALLY ORDERED CHANNELS AND 
CHANNELS WITH MEMORY 

In Ref. [?] Beckman, Gottesman, Nielsen, and Preskill introduced the notions of semicausal and semilocalizable quantum channel for the purpose of studying the constraints on quantum dynamics of bipartite systems imposed by relativistic causality. Subsequently, Eggeling, Schlingemann, and Werner [?] proved the equivalence between semicausality and semilocalizability (see also Ref. [?] for an extensive discussion on the topic). The same notions were generalized to the case of multipartite channels by Kretschmann and Werner in Ref. [?]. From different points of view Refs. [?] studied the structure of multipartite causal channels, showing that they can always be realized as sequences of channels with memory. In this Section we show that all these results, originally obtained in quantum mechanics, actually hold in any causal theory with purification.

Unfortunately, the nomenclature used in the literature is not fully consistent if we go from bipartite to multipartite channels [?]. In order to have a consistent nomenclature, instead of "semicausal" and "semilocalizable channel" we use here the plain expressions causally ordered bipartite channel and sequence of two channels with memory, respectively.

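Definition 60 (Causally ordered bipartite channel) A bipartite channel $\mathscr{C}$ from $A_1A_2$ to $B_1B_2$ is causally ordered if there is a channel $\mathscr{D}$ from $A_1$ to $B_1$ such that

$(e|_{B_2}\, \mathscr{C} = \mathscr{D} \otimes (e|_{A_2}$. (177)

Diagrammatically,

[Circuit diagram: discarding $B_2$ after $\mathscr{C}$ equals $\mathscr{D}$ acting on $A_1$ together with discarding $A_2$.] (178)

Eq. (177) means that the channel does not allow for signaling from the input system $A_2$ to the output system $B_1$. In a relativistic context, this can be interpreted as $B_1$ being outside the causal future of $A_2$.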

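Definition 61 (Sequence of two channels with memory) A bipartite channel $\mathscr{C}$ from $A_1A_2$ to $B_1B_2$ can be realized as a sequence of two channels with memory if there exist two systems $E_1$, $E_2$, called memory systems, and two channels $\mathscr{C}_1 \in \mathfrak{T}(A_1, B_1E_1)$ and $\mathscr{C}_2 \in \mathfrak{T}(A_2E_1, B_2E_2)$ such that

$\mathscr{C} = (e|_{E_2}\, (\mathscr{I}_{B_1} \otimes \mathscr{C}_2)(\mathscr{C}_1 \otimes \mathscr{I}_{A_2})$. (179)

Diagrammatically,

[Circuit diagram: $\mathscr{C}_1$ maps $A_1$ to $B_1E_1$; $\mathscr{C}_2$ then maps $A_2E_1$ to $B_2E_2$, and $E_2$ is discarded.] (180)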



A. Dilation of causally ordered channels 

For causally ordered bipartite channels the dilation 
theorem implies the following result: 

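Theorem 23 (Causal ordering is memory) A bipartite channel $\mathscr{C}$ from $A_1A_2$ to $B_1B_2$ is causally ordered if and only if it can be realized as a sequence of two channels with memory. Moreover, the channels $\mathscr{C}_1, \mathscr{C}_2$ in Eq. (180) can be always chosen such that $\mathscr{C}_2\mathscr{C}_1$ is a reversible dilation of $\mathscr{C}$.

Proof. If Eq. (180) holds, the channel $\mathscr{C}$ is clearly causally ordered, with the channel $\mathscr{D}$ given by $\mathscr{D} := (e|_{E_1}\, \mathscr{C}_1$. Conversely, suppose that $\mathscr{C}$ is causally ordered. Take a reversible dilation of $\mathscr{C}$, say $\mathscr{V} \in \mathfrak{T}(A_1A_2, B_1B_2E)$, and a reversible dilation of $\mathscr{D}$, say $\mathscr{V}_1 \in \mathfrak{T}(A_1, B_1E_1)$. Now, by definition of causally ordered channel (Eq. (178)) we have

$(e|_{B_2}\, (e|_E\, \mathscr{V} = (e|_{E_1}\, (e|_{A_2}\, (\mathscr{V}_1 \otimes \mathscr{I}_{A_2})$. (181)

This means that $\mathscr{V}$ and $\mathscr{V}_1 \otimes \mathscr{I}_{A_2}$ are two reversible dilations of the same channel. By the uniqueness of the reversible dilation expressed by Lemma 24 we then obtain

$\mathscr{V} = (e|_{E_1A_2}\, \mathscr{Z}\, (\mathscr{V}_1 \otimes \mathscr{I}_{A_2})$ (182)

for some channel $\mathscr{Z} \in \mathfrak{T}(E_1A_2, B_2EE_1A_2)$. Once we have defined $E_2 := EE_1A_2$ it only remains to observe that the above diagram is nothing but the thesis, with $\mathscr{C}_1 = \mathscr{V}_1$ and $\mathscr{C}_2 = \mathscr{Z}$. By construction, $\mathscr{C}_2\mathscr{C}_1$ is a reversible dilation of $\mathscr{C}$. ■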

The definition of causally ordered bipartite channel is 
easily extended to the multipartite case as follows: 

Definition 62 (Causally ordered channel) An N-partite channel $\mathscr{C}^{(N)}$ from $A_1 \dots A_N$ to $B_1 \dots B_N$ is causally ordered if for every $k \le N$ there is a channel $\mathscr{C}^{(k)}$ from $A_1 \dots A_k$ to $B_1 \dots B_k$ such that

$(e|_{B_{k+1} \dots B_N}\, \mathscr{C}^{(N)} = \mathscr{C}^{(k)} \otimes (e|_{A_{k+1} \dots A_N}$. (183)

The definition means that the output systems $B_1 \dots B_k$ are outside the causal future of any input system $A_l$ with $l > k$.

Causally ordered channels can be characterized as fol- 
lows: 

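Theorem 24 (Causal ordering is memory for general N) An N-partite channel $\mathscr{C}^{(N)}$ from $A_1 \dots A_N$ to $B_1 \dots B_N$ is causally ordered if and only if there exists a sequence of memory systems $\{E_k\}_{k=0}^N$ with $E_0 = I$ and a sequence of channels $\{\mathscr{V}_k\}_{k=1}^N$ with $\mathscr{V}_k \in \mathfrak{T}(A_kE_{k-1}, B_kE_k)$ such that

$\mathscr{C}^{(N)} = (e|_{E_N}\, \mathscr{V}_N \cdots \mathscr{V}_1$, (184)

where each $\mathscr{V}_k$ acts as the identity on the systems it does not involve. Moreover, $\mathscr{V}_N \cdots \mathscr{V}_1$ is a reversible dilation of $\mathscr{C}^{(N)}$.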

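Proof. It is trivial to see that if $\mathscr{C}^{(N)}$ is a sequence of channels with memory, it is a causally ordered channel. Here we prove the converse. For N = 1 the thesis is just the dilation theorem for channels. We now show that if the thesis holds for N, then it has to hold also for N + 1. Since $\mathscr{C}^{(N+1)}$ is a causal channel, we have in particular

$(e|_{B_{N+1}}\, \mathscr{C}^{(N+1)} = \mathscr{C}^{(N)} \otimes (e|_{A_{N+1}}$. (185)

This means that $\mathscr{C}^{(N+1)}$ can be viewed as a bipartite causally ordered channel from $C_1C_2$ to $D_1D_2$, where $C_1 := A_1 \dots A_N$, $C_2 := A_{N+1}$, $D_1 := B_1 \dots B_N$, and $D_2 := B_{N+1}$. Then Theorem 23 yields two channels $\mathscr{W}_1 \in \mathfrak{T}(C_1, D_1F_1)$ and $\mathscr{W}_2 \in \mathfrak{T}(C_2F_1, D_2F_2)$ such that

$\mathscr{C}^{(N+1)} = (e|_{F_2}\, \mathscr{W}_2\mathscr{W}_1$. (186)

Now, applying the deterministic effect on $D_2$, and using Eq. (185) the above diagram implies also that $\mathscr{W}_1$ is a dilation of $\mathscr{C}^{(N)}$. On the other hand, by the induction hypothesis $\mathscr{C}^{(N)}$ has a reversible dilation $\mathscr{V}^{(N)}$ of the form of Eq. (184), namely

$\mathscr{V}^{(N)} = \mathscr{V}_N \cdots \mathscr{V}_1$ (187)

for some sequence of channels $\{\mathscr{V}_k\}_{k=1}^N$, $\mathscr{V}_k \in \mathfrak{T}(A_kG_{k-1}, B_kG_k)$ and some sequence of memory systems $\{G_k\}_{k=0}^N$, with $G_0 = I$. Since $\mathscr{W}_1$ and $\mathscr{V}^{(N)}$ are reversible dilations of the same channel, the uniqueness of the reversible dilation of Lemma 24 implies $\mathscr{W}_1 = (e|_{G_N}\, \mathscr{Z}\mathscr{V}^{(N)}$, with $\mathscr{Z} \in \mathfrak{T}(G_N, G_NF_1)$ of the form of Eq. (149). Then, the thesis follows by defining the memory systems as

$E_k := G_k$ for $k < N$, $E_N := G_NF_1$, $E_{N+1} := G_NF_2$, (188)

and by defining the channels as

$\mathscr{V}'_k := \mathscr{V}_k$ for $k < N$, $\mathscr{V}'_N := \mathscr{Z}\mathscr{V}_N$, $\mathscr{V}'_{N+1} := \mathscr{W}_2$. (189)

By construction, the channel $\mathscr{V}'_{N+1}\mathscr{V}'_N \cdots \mathscr{V}'_1$ is a reversible dilation of the channel $\mathscr{C}^{(N+1)}$. ■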

Moreover, since the realization of the previous Theorem
is just the reversible dilation of $\mathscr{C}^{(N)}$, we have the
uniqueness result:

Corollary 44 (Uniqueness of the reversible dilation) Let
$\{\mathscr{V}_k\}_{k=1}^{N}$, $\mathscr{V}_k \in \mathfrak{T}(A_kE_{k-1}, B_kE_k)$, be a
reversible realization of the causally ordered channel $\mathscr{C}^{(N)}$
as a sequence of channels with memory, as in Theorem 24.
Suppose that $\{\mathscr{V}'_k\}_{k=1}^{N}$, $\mathscr{V}'_k \in \mathfrak{T}(A_kE'_{k-1}, B_kE'_k)$, is
another reversible realization of $\mathscr{C}^{(N)}$ as a sequence of
channels with memory. Then there exists a channel $\mathscr{R}$
from $E_N$ to $E'_N$ such that

$$ \mathscr{V}'_N \cdots \mathscr{V}'_1 = \mathscr{R}\,\mathscr{V}_N \cdots \mathscr{V}_1 \tag{190} $$

Proof. The channels $\mathscr{V} := \mathscr{V}_N \cdots \mathscr{V}_1 \in
\mathfrak{T}(A_1 \dots A_N, B_1 \dots B_NE_N)$ and $\mathscr{V}' := \mathscr{V}'_N \cdots \mathscr{V}'_1 \in
\mathfrak{T}(A_1 \dots A_N, B_1 \dots B_NE'_N)$ are two reversible dilations of
the channel $\mathscr{C}^{(N)}$. The statement is the direct application
of the uniqueness of the dilation stated by Lemma 24. ■




B. No bit commitment

Sequences of channels with memory can be used to
describe sequences of moves of a given party in a cryptographic
protocol or in a multiparty game (see Ref. ||6^ for
the case of quantum games). In this scenario, the memory
systems are the private systems available to a party,
while the other input-output systems are the systems exchanged
in the communication with other parties. In
this context, the uniqueness of the realization of a causal
channel directly implies the impossibility of tasks like
unconditionally secure bit commitment (see Refs. [p5l pq]
and references therein for the definition of the problem).
A proof in the general case is given by the following:

Corollary 45 (No perfectly secure bit commitment) In a
theory with purification, if an $N$-round protocol is perfectly
concealing, then there is a perfect cheating.



Proof. We first prove the impossibility for protocols
that do not involve the exchange of classical information.
Let $\mathscr{A}_0, \mathscr{A}_1 \in \mathfrak{T}(A_1 \dots A_N, B_1 \dots B_{N-1}B_NF_N)$ be
two causally ordered $N$-partite channels (here the last
output system of the causally ordered channels is the
bipartite system $B_NF_N$), representing Alice's moves to
encode the bit value $b = 0, 1$, respectively. The system
$F_N$ is the system sent from Alice to Bob at the
final phase of the protocol (called the opening) in order
to unveil the value of the bit. If the protocol is
perfectly concealing, then the reduced channels before
the opening phase must be indistinguishable, namely
$(e|_{F_N}\,\mathscr{A}_0 = (e|_{F_N}\,\mathscr{A}_1 := \mathscr{C}$. Now, take two reversible
dilations $\mathscr{V}_0 \in \mathfrak{T}(A_1 \dots A_N, B_1 \dots B_NF_NG_0)$ and $\mathscr{V}_1 \in
\mathfrak{T}(A_1 \dots A_N, B_1 \dots B_NF_NG_1)$ for $\mathscr{A}_0$ and $\mathscr{A}_1$, respectively.
Since $\mathscr{V}_0$ and $\mathscr{V}_1$ are also two dilations of the channel
$\mathscr{C}$, there is a channel $\mathscr{R}$ from $F_NG_0$ to $F_NG_1$ such
that $\mathscr{V}_1 = \mathscr{R}\mathscr{V}_0$. Applying this channel to her private
systems, Alice can switch from $\mathscr{V}_0$ to $\mathscr{V}_1$ just before the
opening. Discarding the auxiliary system $G_1$, this yields
channel $\mathscr{A}_1$. The cheating is perfect, since Alice can play
the strategy $\mathscr{V}_0$ until the end of the commitment, and
decide the bit value before the opening without being detected
by Bob. The above reasoning can be extended to
$N$-round protocols involving the exchange of classical information.
Indeed, classical messages can be modelled by
perfectly distinguishable states, while classical channels
can be modelled by measure-and-prepare channels where
the observation-test is discriminating, and the prepared
states are perfectly distinguishable. The fact that some
systems can only be prepared in perfectly distinguishable
states will be referred to as the "communication interface"
of the protocol ||6^, |6j. In this case, to construct
Alice's cheating strategy we can first take the reversible
dilations $\mathscr{V}_0$, $\mathscr{V}_1$ and the channel $\mathscr{R}$ such that $\mathscr{V}_1 = \mathscr{R}\mathscr{V}_0$.
In order to comply with the communication interface of
the protocol, one can compose $\mathscr{V}_0$ and $\mathscr{V}_1$ with classical
channels on all systems that must be "classical" before
the opening, thus obtaining two channels $\mathscr{D}_0$ and $\mathscr{D}_1$ that
are no longer reversible, but still satisfy $\mathscr{D}_1 = \mathscr{R}\mathscr{D}_0$.
Discarding the auxiliary system $G_1$ and, if required by the
communication interface, applying a classical channel on
$F_N$, Alice then obtains channel $\mathscr{A}_1$. Again, this strategy
allows Alice to decide the value of the bit just before the
opening without being detected. ■



XIII. DETERMINISTIC PROGRAMMING OF
REVERSIBLE TRANSFORMATIONS

In Section VIII we saw that transformations can be
stored into states, in such a way that they can be retrieved
at a later time with non-zero probability of success.
This provides an instance of probabilistic programming,
in which a state plays the role of program for a
transformation, and a suitable machine is able to read
out the program and to reproduce (with some probability)
the correct transformation. Of course, one would
like also to have deterministic programmable machines,
which correctly retrieve the transformations with unit
probability. We now show that such machines are much
more demanding in terms of resources: indeed, to program
a certain number of reversible transformations one
needs to have an equal number of perfectly distinguishable
program states. This theorem is the general version
of the quantum no-programming theorem by Nielsen and
Chuang.

Theorem 25 (No perfect deterministic programming of
reversible channels without distinguishable program states)
Let $\{\mathscr{U}_i\}_{i \in X}$ be a set of reversible
channels on $A$, and $\{\eta_i\}_{i \in X}$ be a set of pure states of $B$.
If there exists a channel $\mathscr{R} \in \mathfrak{T}(AB, A)$ such that

$$ \mathscr{R}\,(\mathscr{I}_A \otimes |\eta_i)_B) = \mathscr{U}_i \qquad \forall i \in X \tag{191} $$

then the states $\{\eta_i\}_{i \in X}$ are perfectly distinguishable.

Proof. Take a dilation of $\mathscr{R}$ with pure state $\varphi_0 \in \mathfrak{S}_1(C)$
and reversible channel $\mathscr{V} \in \mathfrak{T}(ABC)$. Upon defining the
pure states $\psi_i := \eta_i \otimes \varphi_0$ we have

$$ (e|_{BC}\,\mathscr{V}\,(\mathscr{I}_A \otimes |\psi_i)_{BC}) = \mathscr{U}_i \tag{192} $$

Since this is a dilation of the reversible transformation
$\mathscr{U}_i$, by the uniqueness of the reversible dilation stated by
Theorem |l^ there must be a pure state $\psi'_i \in \mathfrak{S}_1(BC)$
such that

$$ \mathscr{V}\,(\mathscr{I}_A \otimes |\psi_i)_{BC}) = \mathscr{U}_i \otimes |\psi'_i)_{BC} \tag{193} $$

By applying $\mathscr{V}^{-1}$ on both sides of Eq. (193), one has

$$ \mathscr{I}_A \otimes |\psi_i)_{BC} = \mathscr{V}^{-1}\,(\mathscr{U}_i \otimes |\psi'_i)_{BC}) \tag{194} $$

and, applying $\mathscr{U}_i^{-1}$,

$$ \mathscr{U}_i^{-1} \otimes |\psi_i)_{BC} = \mathscr{V}^{-1}\,(\mathscr{I}_A \otimes |\psi'_i)_{BC}) \tag{195} $$

Composing Eqs. (193) and (195), with $\mathscr{V}$ acting on $ABC$ and $\mathscr{V}^{-1}$
acting on $A'BC$ for a second copy $A'$ of system $A$, we then obtain

$$ \mathscr{V}^{-1}_{A'BC}\,\mathscr{V}_{ABC}\,(\mathscr{I}_A \otimes \mathscr{I}_{A'} \otimes |\psi_i)_{BC}) = \mathscr{U}_i \otimes \mathscr{U}_i^{-1} \otimes |\psi_i)_{BC} \tag{196} $$

This means that we can obtain an unbounded number of
copies of $\mathscr{U}_i$ and $\mathscr{U}_i^{-1}$ by iterating the application of
$\mathscr{V}$ and $\mathscr{V}^{-1}$. Now, if $\mathscr{U}_i$ and $\mathscr{U}_j$ are different, the
probability of error in discriminating between them using $N$
copies should go to zero as $N$ goes to infinity (this can
be seen by repeating $N$ times the optimal test and using
majority voting, as in the proof of Theorem Q2|). Since
programming the transformations $\{(\mathscr{U}_i \otimes \mathscr{U}_i^{-1})^{\otimes N}\}$ and
discriminating among them is a particular way of discriminating
between the program states $\{\psi_i\}$, the latter
must be perfectly distinguishable. Finally, since the
states $\psi_i = \eta_i \otimes \varphi_0$ are perfectly distinguishable, also the
program states $\eta_i$ must be so. ■
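The majority-voting step invoked in the proof can be illustrated numerically. The following is a minimal sketch, assuming (as a simplification not made in the text) that each repetition of the optimal test errs independently with the same probability below 1/2; the function name `majority_error` and the chosen numbers are hypothetical.

```python
from math import comb


def majority_error(p_single: float, n: int) -> float:
    """Probability that a majority vote over n independent tests is wrong.

    Assumption of this sketch: every single test errs independently with
    probability p_single, and n is odd so that ties cannot occur.
    """
    if n % 2 == 0:
        raise ValueError("use an odd number of repetitions to avoid ties")
    # The vote fails when more than half of the n tests give the wrong answer.
    return sum(
        comb(n, k) * p_single**k * (1.0 - p_single) ** (n - k)
        for k in range((n + 1) // 2, n + 1)
    )


# Error probability for 1, 11 and 101 repetitions with single-shot error 0.3.
errors = [majority_error(0.3, n) for n in (1, 11, 101)]
print(errors)
```

Under these assumptions the error decreases monotonically with the number of repetitions, which is the behaviour the proof relies on.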

Note that trying to use mixed program states $\{\rho_i\}$ cannot
help in reducing the number of perfectly distinguishable
states needed in the program system $B$. Indeed,
suppose that $\rho_i$ is the following mixture: $\rho_i = \sum_j p_j^{(i)} \varphi_j^{(i)}$.
Since reversible transformations are atomic, this means
that each pure state $\varphi_j^{(i)}$ must work as a program for the
$i$-th reversible channel. But the above theorem implies that,
whichever choice $j_i$ we make, the pure states
$\{\varphi_{j_i}^{(i)}\}_{i \in X}$ must be perfectly distinguishable.



XIV. PURIFICATION WITH CONJUGATE 
SYSTEMS 

A. Conjugate purifying systems 

All the results derived so far were consequences of the
sole fact that every state has a purification, unique up to
reversible transformations of the purifying system. We
now add more structure, by introducing the notion of
conjugate purifying systems:

Postulate 2 (Conjugate purifying systems) For
every system $A$ there exists a conjugate purifying system
$\tilde{A}$ such that

1. for every state $\rho \in \mathfrak{S}_1(A)$ there is a purification $\Psi_\rho$
in $\mathfrak{S}_1(A\tilde{A})$ (completeness for purification)

2. $\tilde{\tilde{A}} = A$ (symmetry)

3. $\widetilde{AB} = \tilde{A}\tilde{B}$ (regularity under composition)

The above postulate could be derived from more basic
assumptions. However, we will not discuss this issue here,
and, for the moment, the existence of conjugate systems
will be taken as a Postulate.

Conjugate purifying systems have particularly nice
properties, some of which are given in the following:

Lemma 29 Let $\tilde{A}$ be the conjugate system of $A$. Then,
$\dim \mathfrak{S}_{\mathbb{R}}(\tilde{A}) = \dim \mathfrak{S}_{\mathbb{R}}(A)$.

Proof. Trivial consequence of the bound on dimensions
given in Eq. ( |9^ and of the symmetry condition
$\tilde{\tilde{A}} = A$. ■

In a theory with conjugate purifying systems, the dynamically
faithful pure states considered in Subsection
VII C enjoy the following symmetry property:

Lemma 30 If the pure state $\Psi \in \mathfrak{S}_1(A\tilde{A})$ is dynamically
faithful for system $A$, then it is dynamically faithful for
system $\tilde{A}$.

Proof. Let $\tilde{\omega}$ be the marginal of $\Psi$ on system $\tilde{A}$, namely
$|\tilde{\omega})_{\tilde{A}} := (e|_A\,|\Psi)_{A\tilde{A}}$. Since $\Psi$ is dynamically faithful for
system $A$, the map $\tau : \mathfrak{E}_{\mathbb{R}}(A) \to \mathrm{Span}(D_{\tilde{\omega}})$ defined by
$(a|_A \mapsto |\tau_a)_{\tilde{A}} := (a|_A\,|\Psi)_{A\tilde{A}}$ is injective (and surjective,
by definition). This implies $\dim \mathrm{Span}(D_{\tilde{\omega}}) = \dim \mathfrak{E}_{\mathbb{R}}(A)$.
On the other hand, using the previous Lemma one has
$\dim \mathfrak{E}_{\mathbb{R}}(A) = \dim \mathfrak{S}_{\mathbb{R}}(A) = \dim \mathfrak{S}_{\mathbb{R}}(\tilde{A})$. This proves that
$\tilde{\omega}$ is internal in $\mathfrak{S}(\tilde{A})$. Since $\Psi$ is the purification of an
internal state, by Theorem || it is faithful for system $\tilde{A}$.
■

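Using the previous Lemma it is quite simple to show
that conjugate systems are unique up to operational
equivalence:

Lemma 31 (Uniqueness of the conjugate system)
For any system $A$ the conjugate system $\tilde{A}$ is unique up
to operational equivalence (see Def. |^).

Proof. Suppose that $\tilde{A}'$ is another conjugate system of
$A$. Then take an internal state $\omega \in \mathfrak{S}_1(A)$ and consider
its purifications $\Psi \in \mathfrak{S}_1(A\tilde{A})$ and $\Psi' \in \mathfrak{S}_1(A\tilde{A}')$. By the
uniqueness of purification, since $\Psi$ and $\Psi'$ are purifications
of the same state, there are
two channels $\mathscr{C} \in \mathfrak{T}(\tilde{A}, \tilde{A}')$ and $\mathscr{R} \in \mathfrak{T}(\tilde{A}', \tilde{A})$ such that

$$ |\Psi')_{A\tilde{A}'} = \mathscr{C}\,|\Psi)_{A\tilde{A}} \tag{197} $$
$$ |\Psi)_{A\tilde{A}} = \mathscr{R}\,|\Psi')_{A\tilde{A}'} \tag{198} $$

Clearly, this implies that

$$ |\Psi)_{A\tilde{A}} = \mathscr{R}\mathscr{C}\,|\Psi)_{A\tilde{A}} \tag{199} $$
$$ |\Psi')_{A\tilde{A}'} = \mathscr{C}\mathscr{R}\,|\Psi')_{A\tilde{A}'} \tag{200} $$

On the other hand, by the previous Lemma the states
$\Psi$ and $\Psi'$ are dynamically faithful for systems $\tilde{A}$ and
$\tilde{A}'$, respectively. Hence, one has $\mathscr{R}\mathscr{C} = \mathscr{I}_{\tilde{A}}$ and $\mathscr{C}\mathscr{R} =
\mathscr{I}_{\tilde{A}'}$, namely the channels $\mathscr{C}$ and $\mathscr{R}$ are reversible. By
Definition ||, this means that $\tilde{A}$ and $\tilde{A}'$ are operationally
equivalent. ■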



B. States-transformations isomorphism for 
conjugate purifying systems 

If we use conjugate purifying systems to build up dynamically
faithful states, some of the results derived so
far become simpler and more elegant. First of all, according
to Lemma 30, if a pure state $\Psi_{A\tilde{A}}$ is dynamically
faithful for system $A$, then it is also dynamically faithful
for system $\tilde{A}$. This means that we can simply use the
expression "dynamically faithful pure state $|\Psi)_{A\tilde{A}}$" without
further specifications. Accordingly, we will drop the
superscript $A$ in the state $|\Psi^{(A)})$. We now show that we
can also drop the condition Eq. ( p_57[ ) in the isomorphism
between transformations and bipartite states:

Theorem 26 (Strong version of the states-transformations
isomorphism) The storing map
$\mathscr{C} \mapsto |R_{\mathscr{C}})_{B\tilde{A}} := \mathscr{C}\,|\Psi)_{A\tilde{A}}$, where $\Psi$ is a dynamically faithful
pure state, has the following properties:

1. it defines a bijective correspondence between tests
$\{\mathscr{C}_i\}_{i \in X}$ from $A$ to $B$ and preparation-tests $\{R_i\}_{i \in X}$
for $B\tilde{A}$ satisfying

$$ \sum_{i \in X} (e|_B\,|R_i)_{B\tilde{A}} = (e|_A\,|\Psi)_{A\tilde{A}} \tag{201} $$

2. a transformation $\mathscr{C}$ is atomic (according to the
definition given earlier) if and only if the corresponding
state $R_{\mathscr{C}}$ is pure.

3. in convex theory the map $\mathscr{C} \mapsto R_{\mathscr{C}}$ defines a bijective
correspondence between the cones $\mathfrak{T}_+(A, B)$
and $\mathfrak{S}_+(B\tilde{A})$.

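We now have the following remarkable fact:

Theorem 27 For every effect $a \in \mathfrak{E}(A)$ there is an
atomic transformation $\mathscr{A}_a \in \mathfrak{T}(A)$ such that

$$ (a|_A = (e|_A\,\mathscr{A}_a \tag{202} $$

Moreover, the transformation $\mathscr{A}_a$ is unique up to reversible
channels on the output.

Proof. Let $p_0$ and $p_1$ be the probabilities defined by
$p_0 := (a|_A (e|_{\tilde{A}}\,|\Psi)_{A\tilde{A}}$ and $p_1 := (e - a|_A (e|_{\tilde{A}}\,|\Psi)_{A\tilde{A}}$.
Let $|\Psi_0)_{A\tilde{A}}$ and $|\Psi_1)_{A\tilde{A}}$ be purifications of the normalized
states $|\rho_0)_{\tilde{A}} := (a|_A\,|\Psi)_{A\tilde{A}}/p_0$ and $|\rho_1)_{\tilde{A}} :=
(e - a|_A\,|\Psi)_{A\tilde{A}}/p_1$, respectively. Now, the collection of
states $\{p_0\Psi_0, p_1\Psi_1\}$ is a preparation-test (it can be prepared
via randomization). Moreover, such a preparation-test
has the property

$$ p_0\,(e|_A\,|\Psi_0)_{A\tilde{A}} + p_1\,(e|_A\,|\Psi_1)_{A\tilde{A}} = (e|_A\,|\Psi)_{A\tilde{A}} \tag{203} $$

namely it satisfies Eq. (201). By the states-transformations
isomorphism, it must correspond to a
test $\{\mathscr{A}_0, \mathscr{A}_1\}$ from $A$ to $A$: in particular we must have

$$ \mathscr{A}_0\,|\Psi)_{A\tilde{A}} = p_0\,|\Psi_0)_{A\tilde{A}} \tag{204} $$

Applying the deterministic effect on $A$ we then obtain

$$ (e|_A\,\mathscr{A}_0\,|\Psi)_{A\tilde{A}} = p_0\,(e|_A\,|\Psi_0)_{A\tilde{A}} = p_0\,|\rho_0)_{\tilde{A}} = (a|_A\,|\Psi)_{A\tilde{A}} \tag{205} $$

Since $\Psi$ is dynamically faithful, this implies Eq. (202)
with $\mathscr{A}_a := \mathscr{A}_0$. Moreover, the states-transformations isomorphism
states that $\mathscr{A}_0$ is atomic since $p_0\,|\Psi_0)_{A\tilde{A}} =
\mathscr{A}_0\,|\Psi)_{A\tilde{A}}$ is pure. Finally, suppose that $\mathscr{A}'_0 \in \mathfrak{T}(A)$ is another
atomic transformation such that Eq. (202) holds,
and define the pure state $|\Psi'_0) := \mathscr{A}'_0\,|\Psi)_{A\tilde{A}}/p_0$. Since $\Psi'_0$
and $\Psi_0$ are purifications of the same state $|\rho_0)_{\tilde{A}}$,
they are connected by a reversible channel $\mathscr{U}$ on $A$. Using
the fact that $\Psi$ is dynamically faithful, this implies
$\mathscr{A}'_0 = \mathscr{U}\mathscr{A}_0$. ■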

Moreover, having conjugate purifying systems allows
for a more elegant description of the composition of transformations
in terms of composition of states. We recall
that to treat the composition of states we need a system
of purifications, as defined in Subsect. VIII C. The nice
thing now is that we can take the system of purifications
to be symmetric:

Definition 63 (Symmetric system of purifications)
A symmetric system of purifications is a choice of dynamically
faithful pure states $|\Psi)_{A\tilde{A}}$ and teleportation
effects $(E|_{A\tilde{A}}$ that satisfies the properties

$$ |\Psi)_{AB\tilde{A}\tilde{B}} = |\Psi)_{A\tilde{A}}\,|\Psi)_{B\tilde{B}}, \qquad (E|_{AB\tilde{A}\tilde{B}} = (E|_{A\tilde{A}}\,(E|_{B\tilde{B}} \tag{206} $$

Regarding the probabilities of conclusive teleportation,
we now have $p_A = p_{\tilde{A}}$ (compare Eqs. (113) and (114) in
the teleportation protocol of Corollary 19).

In the next Subsection we will see that there is a canonical
choice of internal states, namely choosing $|\omega)_A =
|\chi)_A$, where $|\chi)_A$ is the unique invariant state of system
$A$ (for the uniqueness, see Lemma |4|). We will choose a
fixed purification of $|\chi)_A$ and refer to it as the canonical
faithful state, denoted by $|\Phi)_{A\tilde{A}}$. In Corollary 46 we
will show that this notation is consistent, since $|\Phi)_{A\tilde{A}}$ is
also a purification of the unique invariant state of $\tilde{A}$.

C. Conjugated transformations 

The most important consequence of the existence of
conjugate purifying systems is the possibility of defining
a one-to-one correspondence between the reversible
transformations of one system $A$ and the reversible transformations
of its conjugate system $\tilde{A}$. As we will see, this
implies in particular the possibility of deterministic teleportation.
The correspondence is set by the following
Lemma:

Lemma 32 (Transposition of reversible channels)
Let $\Phi \in \mathfrak{S}_1(A\tilde{A})$ be a purification of the unique invariant
state $\chi \in \mathfrak{S}_1(A)$. Then, for every reversible channel
$\mathscr{U} \in \mathbf{G}_A$ there exists a unique reversible channel
$\mathscr{U}^\tau \in \mathbf{G}_{\tilde{A}}$, here called the transpose of $\mathscr{U}$ with respect
to $\Phi$, such that

$$ (\mathscr{U} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} = (\mathscr{I}_A \otimes \mathscr{U}^\tau)\,|\Phi)_{A\tilde{A}} \tag{207} $$

Transposition is an injective map satisfying the properties

$$ (\mathscr{I}_A)^\tau = \mathscr{I}_{\tilde{A}} \tag{208} $$
$$ (\mathscr{U}_1\mathscr{U}_2)^\tau = \mathscr{U}_2^\tau\,\mathscr{U}_1^\tau \tag{209} $$

Proof. Since $|\chi)_A$ is invariant, the states $|\Phi)_{A\tilde{A}}$ and
$(\mathscr{U} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}}$ are both purifications of it. Then, there must
be a reversible transformation $\mathscr{U}^\tau \in \mathbf{G}_{\tilde{A}}$ such that Eq.
(207) holds. Moreover, since the invariant state $|\chi)_A$ is
internal, its purification $\Phi$ is dynamically faithful, both
for system $A$ and for system $\tilde{A}$. Dynamical faithfulness
on system $\tilde{A}$ implies that the transformation $\mathscr{U}^\tau$
is uniquely defined, while dynamical faithfulness on system
$A$ implies that transposition is injective. Finally,
Eq. (208) is obvious, while Eq. (209) is easily proved by
repeated application of Eq. (207):

$$ (\mathscr{I}_A \otimes (\mathscr{U}_1\mathscr{U}_2)^\tau)\,|\Phi)_{A\tilde{A}} = (\mathscr{U}_1\mathscr{U}_2 \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} = (\mathscr{U}_1 \otimes \mathscr{U}_2^\tau)\,|\Phi)_{A\tilde{A}} = (\mathscr{I}_A \otimes \mathscr{U}_2^\tau\mathscr{U}_1^\tau)\,|\Phi)_{A\tilde{A}} \tag{210} $$

using the fact that $\Phi$ is dynamically faithful for system
$\tilde{A}$. ■
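In quantum theory, Eq. (207) corresponds to the familiar identity $(U \otimes I)|\Phi\rangle = (I \otimes U^T)|\Phi\rangle$ for the maximally entangled state. The sketch below checks this identity and the order-reversing property of Eq. (209) for a qubit. It assumes the standard Hilbert-space representation rather than the operational formalism of the text, and the helper names (`kron`, `matmul`, `apply`, `transpose`, `rotation`) and the chosen unitaries are illustrative only.

```python
import cmath
import math


def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    na, nb = len(a), len(b)
    return [
        [a[i][j] * b[k][l] for j in range(na) for l in range(nb)]
        for i in range(na)
        for k in range(nb)
    ]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)] for r in range(n)]


def apply(m, v):
    """Apply matrix m to column vector v."""
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def rotation(theta, alpha):
    """A 2x2 unitary used only as a test case (hypothetical parametrization)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -cmath.exp(1j * alpha) * s], [cmath.exp(-1j * alpha) * s, c]]


I2 = [[1, 0], [0, 1]]
phi = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]  # (|00> + |11>)/sqrt(2)
U1, U2 = rotation(0.7, 0.4), rotation(1.1, -0.9)

# Analogue of Eq. (207): U on the first system equals U^T on the second.
lhs = apply(kron(U1, I2), phi)
rhs = apply(kron(I2, transpose(U1)), phi)
gap_207 = max(abs(x - y) for x, y in zip(lhs, rhs))

# Analogue of Eq. (209): transposition reverses the order of composition.
t_of_product = transpose(matmul(U1, U2))
product_of_t = matmul(transpose(U2), transpose(U1))
gap_209 = max(
    abs(t_of_product[r][c] - product_of_t[r][c]) for r in range(2) for c in range(2)
)
print(gap_207, gap_209)
```

Both gaps vanish up to floating-point rounding, for these or any other 2x2 matrices.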

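Lemma 33 (Continuity of transposition)
Transposition is continuous with respect to the operational
norm. Moreover, if $C \subseteq \mathbf{G}_A$ is closed, then
$\tau(C) \subseteq \mathbf{G}_{\tilde{A}}$ is closed.

Proof. Let $p_A$ be the probability of teleportation for
the canonical faithful state $|\Phi)_{A\tilde{A}}$. Define $|R_{\mathscr{U}})_{A\tilde{A}} :=
(\mathscr{U} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}}$. For every $\epsilon > 0$, if $\mathscr{U}, \mathscr{V} \in \mathbf{G}_A$ are
such that $\|\mathscr{U} - \mathscr{V}\|_{A,A} < \epsilon$, then using Eq. (126) one has
$\|\mathscr{U}^\tau - \mathscr{V}^\tau\|_{\tilde{A},\tilde{A}} \le \|R_{\mathscr{U}} - R_{\mathscr{V}}\|_{A\tilde{A}}/p_A < \epsilon/p_A$. This proves
continuity. Now, suppose that $C \subseteq \mathbf{G}_A$ is a closed set,
and suppose that $\{\mathscr{U}_n^\tau\}$ is a sequence in $\tau(C)$ converging
to some reversible transformation $\mathscr{V} \in \mathbf{G}_{\tilde{A}}$. It is easy
to see that $\mathscr{V}$ must be in $\tau(C)$. Indeed, consider the
sequence $\{\mathscr{U}_n\} \subseteq \mathbf{G}_A$. Since $\mathbf{G}_A$ is compact, there must
be a subsequence $\{\mathscr{U}_{n_k}\}$ such that $\mathscr{U}_{n_k} \to \mathscr{U}$ for some $\mathscr{U} \in
\mathbf{G}_A$. Moreover, since $C$ is closed, one has $\mathscr{U} \in C$. Now,
using continuity we obtain $\mathscr{U}_{n_k}^\tau \to \mathscr{U}^\tau$. This implies
that $\mathscr{V} = \lim_{n \to \infty} \mathscr{U}_n^\tau = \mathscr{U}^\tau$, that is, the limit point is
in $\tau(C)$. Hence, $\tau(C)$ is closed. ■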

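Lemma 34 The transposition map $\tau : \mathscr{U} \mapsto \mathscr{U}^\tau$ defined
in Eq. (207) is surjective on $\mathbf{G}_{\tilde{A}}$.

Proof. Take the invariant state $|\chi)_{\tilde{A}}$, a purification of it,
say $\tilde{\Phi} \in \mathfrak{S}_1(\tilde{A}A)$, and define the transpose $\tilde{\tau}$ with respect to
$\tilde{\Phi}$. Since $\tau$ and $\tilde{\tau}$ are both injective transformations,
their composition $\iota := \tau\tilde{\tau} : \mathbf{G}_{\tilde{A}} \to \mathbf{G}_{\tilde{A}}$ is injective too.
Moreover, $\iota$ is a homomorphism, since $\iota(\mathscr{I}_{\tilde{A}}) = \mathscr{I}_{\tilde{A}}$ and
$\iota(\mathscr{V}\mathscr{W}) = \iota(\mathscr{V})\,\iota(\mathscr{W})$ for every $\mathscr{V}, \mathscr{W}$ in $\mathbf{G}_{\tilde{A}}$. We now
claim that $\iota$ is surjective. Of course, since $\iota := \tau\tilde{\tau}$, this
will also prove that $\tau$ is surjective. Consider the sequence
$\{H_n\}$ defined by $H_n := \iota^n(\mathbf{G}_{\tilde{A}})$. By the previous Lemma
33, each $H_n$ is a closed subgroup of $\mathbf{G}_{\tilde{A}}$, and one has

$$ \mathbf{G}_{\tilde{A}} \equiv H_0 \supseteq H_1 \supseteq \dots \supseteq H_n \supseteq H_{n+1} \supseteq \dots \tag{211} $$

namely $\{H_n\}$ is a descending chain of subgroups of $\mathbf{G}_{\tilde{A}}$.
Since $\mathbf{G}_{\tilde{A}}$ is a compact Lie group, every descending chain
of closed subgroups must be eventually constant (see e.g.
p. 136 of 16^), i.e. there exists a finite $\bar{n}$ such that

$$ H_n = H_{n+1} \qquad \forall n \ge \bar{n} \tag{212} $$

Applying $\iota^{-\bar{n}}$ on both sides, this implies $H_0 = H_1$,
namely $\mathbf{G}_{\tilde{A}} = \iota(\mathbf{G}_{\tilde{A}})$. Therefore, $\iota$ is surjective. ■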

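The first consequences of the properties of transposition
are given by the following corollary:

Corollary 46 Let $\Phi \in \mathfrak{S}_1(A\tilde{A})$ be a purification of the
unique invariant state $\chi_A \in \mathfrak{S}_1(A)$. Then the complementary
state $|\tilde{\chi})_{\tilde{A}} := (e|_A\,|\Phi)_{A\tilde{A}}$ is the unique invariant
state of $\tilde{A}$.

Proof. For every $\mathscr{U} \in \mathbf{G}_A$ we have

$$ \mathscr{U}^\tau\,|\tilde{\chi})_{\tilde{A}} = (e|_A\,(\mathscr{I}_A \otimes \mathscr{U}^\tau)\,|\Phi)_{A\tilde{A}} = (e|_A\,(\mathscr{U} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} = (e|_A\,|\Phi)_{A\tilde{A}} = |\tilde{\chi})_{\tilde{A}} \tag{213} $$

Since $\tau$ is surjective, $\mathscr{U}^\tau$ is an arbitrary element of $\mathbf{G}_{\tilde{A}}$,
hence $\tilde{\chi}$ is invariant. ■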

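Definition 64 (Conjugate of a reversible channel)
The conjugate of the reversible channel $\mathscr{U} \in \mathfrak{T}(A)$ with
respect to the state $\Phi \in \mathfrak{S}(A\tilde{A})$ is the reversible channel
$\mathscr{U}^* \in \mathfrak{T}(\tilde{A})$ defined by $\mathscr{U}^* := (\mathscr{U}^\tau)^{-1}$, where the
transpose is defined with respect to $\Phi$.

Note that with this definition the canonical faithful
state $|\Phi)_{A\tilde{A}}$ is isotropic, i.e. it is invariant under combined
reversible channels on the conjugate systems $A$ and
$\tilde{A}$:

$$ (\mathscr{U} \otimes \mathscr{U}^*)\,|\Phi)_{A\tilde{A}} = |\Phi)_{A\tilde{A}} \qquad \forall\,\mathscr{U} \in \mathbf{G}_A. \tag{214} $$

Moreover, we have also the converse:

Corollary 47 (Isotropic states) A pure state $\Psi \in
\mathfrak{S}_1(A\tilde{A})$ is isotropic if and only if $|\Psi)_{A\tilde{A}} = (\mathscr{V} \otimes
\mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}}$ for some reversible $\mathscr{V} \in \mathbf{G}_A$ such that

$$ \mathscr{U}\mathscr{V} = \mathscr{V}\mathscr{U} \qquad \forall\,\mathscr{U} \in \mathbf{G}_A. \tag{215} $$

Proof. Clearly, a state of the above form is isotropic.
Conversely, if $\Psi$ is isotropic, it satisfies Eq. (214), and,
therefore, its marginal state on system $A$ is the invariant
state $|\chi)_A$. Since $\Psi$ and $\Phi$ are purifications of the
same state, there must exist a reversible channel $\mathscr{V} \in \mathbf{G}_A$
such that $|\Psi)_{A\tilde{A}} = (\mathscr{V} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}}$. The isotropy condition then
gives

$$ (\mathscr{V} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} = (\mathscr{U} \otimes \mathscr{U}^*)\,|\Psi)_{A\tilde{A}} = (\mathscr{U}\mathscr{V} \otimes \mathscr{U}^*)\,|\Phi)_{A\tilde{A}} = (\mathscr{U}\mathscr{V}\mathscr{U}^{-1} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} \tag{216} $$

Dynamical faithfulness of $\Phi$ then implies $\mathscr{V} = \mathscr{U}\mathscr{V}\mathscr{U}^{-1}$,
namely Eq. (215). ■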

Recalling that the center of the group $\mathbf{G}_A$ is the set
of all elements $\mathscr{V} \in \mathbf{G}_A$ such that $\mathscr{U}\mathscr{V} = \mathscr{V}\mathscr{U}$ for every
$\mathscr{U} \in \mathbf{G}_A$, it is immediate to state the following:

Corollary 48 The canonical faithful state $|\Phi)_{A\tilde{A}}$ is the
unique isotropic state of system $A\tilde{A}$ if and only if the
compact Lie group $\mathbf{G}_A$ has trivial center.

The conclusion of this Subsection is summarized by the
following theorem:

Theorem 28 (Isomorphism of groups) The reversible
channels on $A$ and $\tilde{A}$ form two isomorphic Lie
groups, with the isomorphism given by the conjugation
map $* : \mathbf{G}_A \to \mathbf{G}_{\tilde{A}}$, $\mathscr{U} \mapsto \mathscr{U}^*$.

Proof. Clearly, $*$ is a homomorphism, namely $\mathscr{I}_A^* =
\mathscr{I}_{\tilde{A}}$ and $(\mathscr{U}_1\mathscr{U}_2)^* = ((\mathscr{U}_1\mathscr{U}_2)^\tau)^{-1} = \mathscr{U}_1^*\mathscr{U}_2^*$. Moreover, $*$ is
injective and surjective, since it is the composition of two
injective and surjective maps, namely transposition and
inversion. ■
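For a quantum system the conjugate of a unitary channel, defined as the inverse of its transpose, is given by the entrywise complex conjugate of the unitary matrix. The following minimal sketch checks, for a qubit, the isotropy property of Eq. (214) and the homomorphism property used in the proof above. It assumes the standard Hilbert-space representation; all helper names and the sample unitaries are hypothetical.

```python
import cmath
import math


def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    na, nb = len(a), len(b)
    return [
        [a[i][j] * b[k][l] for j in range(na) for l in range(nb)]
        for i in range(na)
        for k in range(nb)
    ]


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)] for r in range(n)]


def apply(m, v):
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]


def conj(m):
    """Entrywise complex conjugate, i.e. the inverse of the transpose of a unitary."""
    return [[complex(x).conjugate() for x in row] for row in m]


def rotation(theta, alpha):
    """A 2x2 unitary used only as a test case (hypothetical parametrization)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -cmath.exp(1j * alpha) * s], [cmath.exp(-1j * alpha) * s, c]]


phi = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]  # (|00> + |11>)/sqrt(2)
U1, U2 = rotation(0.7, 0.4), rotation(1.1, -0.9)

# Isotropy: applying U on one side and its conjugate on the other leaves phi unchanged.
out = apply(kron(U1, conj(U1)), phi)
isotropy_gap = max(abs(x - y) for x, y in zip(out, phi))

# Conjugation preserves the order of composition (it is a homomorphism).
lhs = conj(matmul(U1, U2))
rhs = matmul(conj(U1), conj(U2))
hom_gap = max(abs(lhs[r][c] - rhs[r][c]) for r in range(2) for c in range(2))
print(isotropy_gap, hom_gap)
```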



D. Deterministic teleportation 

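Lemma 35 Let $\mathscr{T}_A$ and $\mathscr{T}_{\tilde{A}}$ be the twirling channels on
$A$ and $\tilde{A}$, respectively, and let $\Phi \in \mathfrak{S}_1(A\tilde{A})$ be the canonical
faithful state. Then, one has

$$ (\mathscr{T}_A \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} = (\mathscr{I}_A \otimes \mathscr{T}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} = |\chi)_A \otimes |\chi)_{\tilde{A}} $$

Proof. We have

$$ (\mathscr{T}_A \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} = \int_{\mathbf{G}_A} d\mathscr{U}\,(\mathscr{U} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} = \int_{\mathbf{G}_A} d\mathscr{U}\,(\mathscr{I}_A \otimes (\mathscr{U}^*)^{-1})\,|\Phi)_{A\tilde{A}} = (\mathscr{I}_A \otimes \mathscr{T}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} $$

having used the fact that $\mathbf{G}_A$ and $\mathbf{G}_{\tilde{A}}$ are isomorphic,
and, therefore, have the same Haar measure. Moreover,
since the output of the twirling channel is an invariant
state, we have that $|\sigma)_{A\tilde{A}} := (\mathscr{T}_A \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}}$ is invariant under
local reversible transformations, i.e.

$$ |\sigma)_{A\tilde{A}} = (\mathscr{U} \otimes \mathscr{V})\,|\sigma)_{A\tilde{A}} \qquad \forall\,\mathscr{U} \in \mathbf{G}_A,\ \forall\,\mathscr{V} \in \mathbf{G}_{\tilde{A}}. \tag{220} $$

Finally, we invoke Theorem |[ which states that the
unique state invariant under local reversible transformations
is $\chi_A \otimes \chi_{\tilde{A}}$. ■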
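A quantum instance of this Lemma can be checked directly for a qubit. The sketch below assumes the standard Hilbert-space representation and replaces the integral over the full reversible group by the uniform mixture of the four Pauli unitaries, which acts on a single qubit exactly as the twirling channel does (it sends every state to the maximally mixed one). The helper names are hypothetical.

```python
import math


def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    na, nb = len(a), len(b)
    return [
        [a[i][j] * b[k][l] for j in range(na) for l in range(nb)]
        for i in range(na)
        for k in range(nb)
    ]


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)] for r in range(n)]


def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[complex(m[c][r]).conjugate() for c in range(n)] for r in range(n)]


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
paulis = [I2, X, Y, Z]

phi = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]  # (|00> + |11>)/sqrt(2)
rho = [[phi[r] * phi[c] for c in range(4)] for r in range(4)]  # projector on phi

# Twirl the first qubit only: average of (P x I) rho (P x I)^dagger over the Paulis.
twirled = [[0j] * 4 for _ in range(4)]
for P in paulis:
    K = kron(P, I2)
    term = matmul(matmul(K, rho), dagger(K))
    for r in range(4):
        for c in range(4):
            twirled[r][c] += term[r][c] / 4

# The product of the two invariant (maximally mixed) states is the 4x4 identity over 4.
max_dev = max(
    abs(twirled[r][c] - (0.25 if r == c else 0.0)) for r in range(4) for c in range(4)
)
print(max_dev)
```

The twirled state coincides with the product of the two maximally mixed marginals, which is the quantum counterpart of the state $\chi_A \otimes \chi_{\tilde{A}}$ in the Lemma.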

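Theorem 29 (Deterministic teleportation) Let $A$
and $A'$ be two operationally equivalent systems, and let
$\{p_i\mathscr{U}_i\}_{i \in X}$ be a twirling test, where each $\mathscr{U}_i$ is a reversible
channel on $A$. Then there exists an observation-test
$\{B_i\}_{i \in X}$ on $\tilde{A}A'$ such that for every outcome $i$ one has

$$ (B_i|_{\tilde{A}A'}\,(\mathscr{U}_i^{-1} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}} = p_i\,\mathscr{I}_{A' \to A} \tag{221} $$

where $\mathscr{I}_{A' \to A}$ denotes the identity channel between the
equivalent systems $A'$ and $A$. Moreover, each effect $B_i$ must be atomic.

Proof. Define the preparation-test $\{p_i\Psi_i\}_{i \in X}$ with
$|\Psi_i)_{A\tilde{A}} := (\mathscr{U}_i \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}}$. By the previous Lemma, we have
$\sum_i p_i\Psi_i = \chi_A \otimes \chi_{\tilde{A}}$, namely the coarse-graining of the
preparation-test $\{p_i\Psi_i\}$ yields the invariant state of $A\tilde{A}$.
By the states-transformations isomorphism of Theorem
26, there exists an observation-test on $\tilde{A}A'$, say $\{B_i\}_{i \in X}$,
such that

$$ (B_i|_{\tilde{A}A'}\,|\Phi)_{A\tilde{A}}\,|\Phi)_{A'\tilde{A}'} = p_i\,|\Psi_i)_{A\tilde{A}'} \tag{222} $$

Clearly, the states-transformations isomorphism implies
that each effect $B_i$ must be atomic (indeed, the corresponding
state is pure). Applying $\mathscr{U}_i^{-1}$ on system $A$ we
obtain

$$ (B_i|_{\tilde{A}A'}\,(\mathscr{U}_i^{-1} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}}\,|\Phi)_{A'\tilde{A}'} = p_i\,|\Phi)_{A\tilde{A}'} \tag{223} $$

The thesis follows from the fact that $\Phi$ is dynamically
faithful. ■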

In theories with local discriminability we have the additional
result:

Corollary 49 Let $\{p_i\mathscr{U}_i\}_{i \in X}$ be a twirling test where
each $\mathscr{U}_i$ is a reversible channel. In a theory with local
discriminability the number of outcomes $|X|$ cannot be
smaller than $\dim \mathfrak{S}_{\mathbb{R}}(A)$.

Proof. By Eq. (221) the state $(\mathscr{U}_i^{-1} \otimes \mathscr{I}_{\tilde{A}})\,|\Phi)_{A\tilde{A}}$
and the effect $B_i$ achieve teleportation with probability
$p_i$. In a theory with local discriminability the bound
of Eq. (116) gives $p_i \le 1/\dim \mathfrak{S}_{\mathbb{R}}(A)$. We then have
$1 = \sum_{i \in X} p_i \le |X|/\dim \mathfrak{S}_{\mathbb{R}}(A)$. ■

If two parties share the pure state $|\Phi)_{A\tilde{A}}$, then by the
teleportation protocol they can convert it into an arbitrary
state $\Psi \in \mathfrak{S}_1(A\tilde{A})$ using only local operations and one
round of classical communication (one-way LOCC). We
now show that the state $|\Phi)_{A\tilde{A}}$ is the maximally entangled
state of $\mathfrak{S}_1(A\tilde{A})$, that is, if we can convert another state
$\Psi$ to $\Phi$ by one-way LOCC, then $\Psi = (\mathscr{U} \otimes \mathscr{I}_{\tilde{A}})\Phi$ for
some local reversible channel $\mathscr{U} \in \mathfrak{T}(A)$. To see that,
we show that if $\Psi$ allows for deterministic teleportation,
then $\Psi = (\mathscr{U} \otimes \mathscr{I}_{\tilde{A}})\Phi$.

Theorem 30 (Unique structure of deterministic 
teleportation) Let 4* e ©i(AA) be a pure state, 
{^i}iex be a collection of channels on A, {p^jigx a set 
of probabilities, and {Mi}i^x be an observation-test on 
AA', with A' and A operationally equivalent systems. If 
for every outcome i one has 




[circuit diagram] (224)



then 



1. each channel is reversible, namely = ^ 
for some G Ga 

2. there is a reversible channel 6 Ga such that 

3. each effect Mi has the property (Mi|^^, |x)^ = 
Pi (e|A' 

4. J^iexPi'^i ~ where is the twirling channel 



Proof. Define the transformation .s/i as 

[circuit diagram] (225)



With this definition we have — Pi-J^A for every out- 

come i. Moreover, applying the deterministic effect on 
both sides of the equality we obtain 



(gIaM = (el^^iM = Pt (gIa , 



(226) 



that is, each ^ is proportional to a channel "^i, i.e. si/i = 
Pi^i. We now have o "^^i = ,Jfp^, that means that the 
channel "loi is invertible. By corollary this implies 
that '^i is reversible, namely "^i = 'Wi for some G 
Ga- Clearly, this requires Mi = Now consider the 

marginal of 4* on system A: one has 

[circuit diagram] (227)

having used the invariance of x- But this means that 
4* and $ have the same marginal on system A, and, 
therefore, |4')aa = (^A €5 '^) |$)aa for some suitable 

\ we can also transfer on 
= (^^ ® ^a) I*)aa- Using 



G G^- Using Lemma 
system A, getting |4')aa 
s^i = pi^i we then get 
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By the states-transformations isomorphism, this means 
that each Mi is atomic (indeed, the corresponding state 
is pure). Applying the deterministic effect on system A, 
the above equation also implies 

[circuit diagram] (229)



which amounts to saying (Mil^^ Ix)a 



Pi (eU: 



because 

$ is dynamically fa ithfu l. Moreover, summing over the 
outcomes in Eq. ( ^28D we obtain i^iPi'^i)]^) p^x ~ 
\x)a |x)a ~ ^ I*^)aa- Again, since $ is dynamically 
faithful, this implies ^iPi^i = 5^. ■ 

In a theory with local discriminability one has also the
following result:

Corollary 50 Let $|\Psi)_{A\tilde{A}}$, $\{\mathscr{C}_i\}_{i \in X}$, $\{p_i\}_{i \in X}$, and
$\{M_i\}_{i \in X}$ be the state, the recovery channels, the probabilities,
and the observation-test in a deterministic teleportation
protocol, as in Theorem 30. In a theory with
local discriminability the number of outcomes satisfies the
bound $|X| \ge \dim \mathfrak{S}_{\mathbb{R}}(A)$. The bound is achieved if and
only if $p_i = 1/\dim \mathfrak{S}_{\mathbb{R}}(A)$ for every $i$, and the states
$|\Psi_i)_{A\tilde{A}} := (\mathscr{C}_i \otimes \mathscr{I}_{\tilde{A}})\,|\Psi)_{A\tilde{A}}$, $i \in X$, are perfectly distinguishable
with the observation-test $\{M_i\}$, i.e.

$$ (M_i|\Psi_j)_{A\tilde{A}} = \delta_{ij} \tag{230} $$

Proof. From Eq. (115) we have $p_i \le 1/\dim \mathfrak{S}_{\mathbb{R}}(A)$ for
every $i$, and, therefore, $1 \le |X|/\dim \mathfrak{S}_{\mathbb{R}}(A)$. Clearly, the
bound is achieved if and only if $p_i = 1/\dim \mathfrak{S}_{\mathbb{R}}(A)$ for
every $i$. In this case, it can be seen from the proof of Eq.
(115) that one has $(M_i|\Psi_i)_{A\tilde{A}} = 1$. Since $\{M_i\}_{i \in X}$ is
an observation-test, and the probabilities of all outcomes
must sum up to unit, this implies $(M_j|\Psi_i)_{A\tilde{A}} = \delta_{ij}$. ■

The above Corollary shows that if teleportation has the
minimum possible number of outcomes $|X| = \dim \mathfrak{S}_{\mathbb{R}}(A)$,
then dense coding is possible: by acting locally on one
side of the state $\Psi$ one can produce $\dim \mathfrak{S}_{\mathbb{R}}(A)$ perfectly
distinguishable states. This number exceeds the maximum
number of perfectly distinguishable states available
in system $A$, which must be strictly smaller than
$\dim \mathfrak{S}_{\mathbb{R}}(A)$ due to an earlier Corollary. However, we did not
prove here the existence of such a teleportation scheme
with $|X| = \dim \mathfrak{S}_{\mathbb{R}}(A)$. This issue, which is closely related
to the topic of discrimination in theories with purification,
will be addressed in a future work.
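In quantum theory the dense-coding statement can be made concrete for a qubit, where the real dimension of the state space is 4 while at most 2 states of the single system are perfectly distinguishable. The sketch below, which assumes the standard Hilbert-space representation and uses hypothetical helper names, applies the four Pauli unitaries to one side of the maximally entangled state and verifies that the four resulting states are mutually orthogonal.

```python
import math


def kron(a, b):
    """Kronecker product of two square matrices stored as nested lists."""
    na, nb = len(a), len(b)
    return [
        [a[i][j] * b[k][l] for j in range(na) for l in range(nb)]
        for i in range(na)
        for k in range(nb)
    ]


def apply(m, v):
    return [sum(m[r][c] * v[c] for c in range(len(v))) for r in range(len(m))]


def inner(u, v):
    """Hilbert-space inner product, conjugate-linear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
paulis = [I2, X, Y, Z]

phi = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]  # (|00> + |11>)/sqrt(2)

# Local encoding: act with each Pauli unitary on the first qubit only.
encoded = [apply(kron(P, I2), phi) for P in paulis]

# Gram matrix of overlaps; orthonormality means perfect distinguishability.
gram = [[abs(inner(u, v)) for v in encoded] for u in encoded]
max_dev = max(
    abs(gram[i][j] - (1.0 if i == j else 0.0)) for i in range(4) for j in range(4)
)
print(len(encoded), max_dev)
```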



XV. CONCLUSIONS AND PERSPECTIVES ON 
FUTURE WORK 

In this paper we investigated causal probabilistic theo- 
ries with purification, and derived a surprising wealth 
of features that are characteristic of quantum theory 
without resorting to the framework of Hilbert spaces or
C*-algebras. Among theories with local discriminability, 
quantum theory appears as the only known one that sat- 
isfies the purification principle. The absence of a coun- 
terexample and the amount of quantum features derived 
suggest that quantum theory could be the only causal 
theory with purification and local discriminability. How- 
ever, at the moment we do not have a derivation of quan- 
tum theory from the purification principle, and the ques- 
tion whether there are other theories satisfying the above 
postulates remains open. 

Any answer to this question would lead to an interest- 
ing scenario: If quantum theory is the only causal the- 
ory with purification and local discriminability, then the 



machinery of Hilbert spaces is a quite redundant way 
to prove theorems that in fact can be derived directly 
from basic physical notions. What is more, the general 
proofs of most theorems are simpler and more intuitive 
than the original quantum proofs. On the other hand, if 
quantum theory is not the only theory satisfying our pos- 
tulates, the existence of more general theories, that share 
with quantum mechanics the basic structure highlighted 
in this paper, is also a very fascinating perspective. More- 
over, abandoning the standard quantum formalism would 
be interesting especially in view of a possible reconcilia- 
tion with general relativity. In this direction, particularly 
appealing is the possibility of dropping causality from 
our requirements, and of working with non-unique deter- 
ministic effects. The study of non-causal theories with 
purification is expected to provide new insights toward a 
formulation of quantum gravity. Such an approach would 
be related to the informational approaches of Hardy
and Lloyd. The study of theories with purification in
the non-causal setting will be addressed in a forthcoming 
paper. 

Another direction of further research is the general- 
ization of the notion of subsystem. On the one hand, 
introducing classical systems in the theory and clarify- 
ing how they can be viewed as subsystems of the non- 
classical ones is expected to provide an additional struc- 
ture that will eventually contribute to the full derivation 
of quantum mechanics. On the other hand, under suitable
assumptions, a face of the convex set of states of a
system can be considered as the set of states of some sub- 
system. Following this observation, we plan to consider 
information-theoretic tasks like state compression in the- 
ories with purification, by analyzing the mechanism that 
leads the state $\rho^{\otimes N}$ to approach a face corresponding to
the state space of $M < N$ systems.